search
HomeTechnology peripheralsAIChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang University's hot new paper, the HuggingGPT project has been open source

The AI ​​craze triggered by ChatGPT has also "burned" the financial circle.

Recently, researchers at Bloomberg have also developed a GPT in the financial field—Bloomberg GPT, with 50 billion parameters.

The emergence of GPT-4 has given many people a taste of the powerful capabilities of large language models.

#However, OpenAI is not open. Many people in the industry have begun to clone GPT, and many ChatGPT replacement models are built on open source models, especially the Meta open source LLMa model.

#For example, Stanford's Alpaca, UC Berkeley teamed up with CMU, Stanford and other Vicuna, Dolly of the startup Databricks, etc.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

Various ChatGPT-like large-scale language models built for different tasks and applications present a hundred schools of thought in the entire field. potential.

So the question is, how do researchers choose an appropriate model, or even multiple models, to complete a complex task?

Recently, the research team from Microsoft Research Asia and Zhejiang University released HuggingGPT, a large model collaboration system.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

##Paper address: https://arxiv.org/pdf/2303.17580.pdf

HuggingGPT uses ChatGPT as a controller to connect various AI models in the HuggingFace community to complete multi-modal complex tasks.

This means that you will have a kind of super magic. Through HuggingGPT, you can have multi-modal capabilities, including pictures, videos, and voices. .

HuggingGPT Bridge

Researchers pointed out that solving the current problems of large language models (LLMs) may be the first step towards AGI. It is also a critical step.

Because the current technology of large language models still has some shortcomings, there are some pressing challenges on the road to building AGI systems.

- Limited by the input and output forms of text generation, current LLMs lack the ability to process complex information (such as vision and speech);

- In actual application scenarios, some complex tasks usually consist of multiple subtasks, so the scheduling and collaboration of multiple models are required, which is also beyond the capabilities of the language model;

- For some challenging tasks, LLMs show excellent results in zero-sample or few-sample settings, but they are still weaker than some experts (such as fine-tuned models).

To handle complex AI tasks, LLMs should be able to coordinate with external models to leverage their capabilities. Therefore, the key point is how to choose the appropriate middleware to bridge LLMs and AI models.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

Researchers found that each AI model can be expressed in a language form by summarizing its model functions.

Thus, a concept is introduced, "Language is LLMs, namely ChatGPT, a universal interface to connect artificial intelligence models."

By incorporating the AI ​​model description into the prompts, ChatGPT can be considered the brain that manages the AI ​​model. Therefore, this method allows ChatGPT to call external models to solve practical tasks.

To put it simply, HuggingGPT is a collaboration system, not a large model.

Its function is to connect ChatGPT and HuggingFace to process input in different modalities and solve many complex artificial intelligence tasks.

So, every AI model in the HuggingFace community has a corresponding model description in the HuggingGPT library and is integrated into the prompt to build a ChatGPT connection.

HuggingGPT then uses ChatGPT as the brain to determine the answer to the question.

So far, HuggingGPT has integrated hundreds of models on HuggingFace around ChatGPT, covering text classification, target detection, semantic segmentation, image generation, 24 tasks including Q&A, text-to-speech, and text-to-video.

Experimental results prove that HuggingGPT has the ability to handle multi-modal information and complex artificial intelligence tasks.

Four-step workflow

HuggingGPT entire workflow It can be divided into the following four stages:

-Task planning: ChatGPT parses user requests, breaks them into multiple tasks, and plans the task sequence based on its knowledge and dependencies

- Model selection: LLM assigns the parsed tasks to expert models based on the model description in HuggingFace

-Task execution: The expert model executes the assigned task on the inference endpoint and records the execution information and inference results into LLM

- Response generation: LLM summarizes the execution process log and inference results, and returns the summary to the user

Multi-modal capabilities, with

Experimental settings

In the experiment, the researcher used gpt-3.5-turbo and text-davinci-003 Variants of GPT models serve as Large Language Models (LLMs), which are publicly accessible through the OpenAI API.

#In order to make the output of LLM more stable, we set the decoding temperature to 0.

#At the same time, in order to adjust the output of LLM to conform to the expected format, we set logit_bias to 0.1 on the format constraint.

The researchers provide detailed tips designed for the mission planning, model selection, and reaction generation phases in the following table, where {{variable}} represents Before the prompt is entered into the LLM, the field values ​​need to be filled in with the corresponding text.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

Researchers tested HuggingGPT on a wide range of multi-modal tasks.

With the cooperation of ChatGP and expert models, HuggingGPT can solve tasks in multiple modes such as language, image, audio and video, including detection, generation, classification and question answering. Task.

#Although these tasks may seem simple, mastering the basic capabilities of HuggingGPT is a prerequisite for solving complex tasks.

For example, visual question and answer task:

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

# #Text generation:

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

文生图:

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

HuggingGPT can integrate multiple input contents to perform simple reasoning. It can be found that even if there are multiple task resources, HuggingGPT can decompose the main task into multiple basic tasks, and finally integrate the inference results of multiple models to obtain the correct answer.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

In addition, the researchers evaluated the effectiveness of HuggingGPT in complex task situations through tests.

# demonstrated HuggingGPT’s ability to handle multiple complex tasks.

When processing multiple requests, they may contain multiple implicit tasks or require multiple aspects of information. In this case, relying on an expert model to solve the problem is not enough.

#HuggingGPT can organize the collaboration of multiple models through task planning.

A user request may explicitly contain multiple tasks:

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

The figure below shows HuggingGPT’s ability to handle complex tasks in multi-turn dialogue scenarios.

Users divide a complex request into several steps and reach the final goal through multiple rounds of requests. It was found that HuggingGPT can track the situation status of user requests through dialogue situation management in the task planning stage, and can well solve the requested resources and task planning mentioned by users.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

"Jarvis" open source

Currently, this project has been open sourced on GitHub. But the code has not been fully released.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

Interestingly, the researchers named this project Jarvis in "Iron Man", the invincible AI Here it comes.

JARVIS: A system connecting LLMs and the ML community

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

By the way, HuggingGPT requires the OpenAI API to be used.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

Netizen: The future of research

JARVIS / HuggingGPT is just like the Toolformer proposed by Meta before. They are all acting as connectors.

#Even, including ChatGPT plugins.

Netizens said, "I strongly suspect that the first artificial general intelligence (AGI) will appear earlier than expected. It will rely on "glue" artificial intelligence , able to intelligently glue together a series of narrow artificial intelligence and practical tools.

#I was given access to the plug-in, which transformed it from a math noob to a math genius overnight. Of course, this is only a small step, but it is a sign of future development trends.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

I predict that in the next year or so we will see an AI assistant that is Dozens of large language models (LLMs) and similar tools are connected, and end users simply give instructions to their assistants to complete tasks for them. This sci-fi moment is coming.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

Some netizens said that this is the future research method.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

GPT In front of a lot of tools, you know how to use them.

ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang Universitys hot new paper, the HuggingGPT project has been open source

The above is the detailed content of ChatGPT can choose models by itself! Microsoft Asia Research Institute + Zhejiang University's hot new paper, the HuggingGPT project has been open source. For more information, please follow other related articles on the PHP Chinese website!

Statement
This article is reproduced at:51CTO.COM. If there is any infringement, please contact admin@php.cn delete
How to Build Your Personal AI Assistant with Huggingface SmolLMHow to Build Your Personal AI Assistant with Huggingface SmolLMApr 18, 2025 am 11:52 AM

Harness the Power of On-Device AI: Building a Personal Chatbot CLI In the recent past, the concept of a personal AI assistant seemed like science fiction. Imagine Alex, a tech enthusiast, dreaming of a smart, local AI companion—one that doesn't rely

AI For Mental Health Gets Attentively Analyzed Via Exciting New Initiative At Stanford UniversityAI For Mental Health Gets Attentively Analyzed Via Exciting New Initiative At Stanford UniversityApr 18, 2025 am 11:49 AM

Their inaugural launch of AI4MH took place on April 15, 2025, and luminary Dr. Tom Insel, M.D., famed psychiatrist and neuroscientist, served as the kick-off speaker. Dr. Insel is renowned for his outstanding work in mental health research and techno

The 2025 WNBA Draft Class Enters A League Growing And Fighting Online HarassmentThe 2025 WNBA Draft Class Enters A League Growing And Fighting Online HarassmentApr 18, 2025 am 11:44 AM

"We want to ensure that the WNBA remains a space where everyone, players, fans and corporate partners, feel safe, valued and empowered," Engelbert stated, addressing what has become one of women's sports' most damaging challenges. The anno

Comprehensive Guide to Python Built-in Data Structures - Analytics VidhyaComprehensive Guide to Python Built-in Data Structures - Analytics VidhyaApr 18, 2025 am 11:43 AM

Introduction Python excels as a programming language, particularly in data science and generative AI. Efficient data manipulation (storage, management, and access) is crucial when dealing with large datasets. We've previously covered numbers and st

First Impressions From OpenAI's New Models Compared To AlternativesFirst Impressions From OpenAI's New Models Compared To AlternativesApr 18, 2025 am 11:41 AM

Before diving in, an important caveat: AI performance is non-deterministic and highly use-case specific. In simpler terms, Your Mileage May Vary. Don't take this (or any other) article as the final word—instead, test these models on your own scenario

AI Portfolio | How to Build a Portfolio for an AI Career?AI Portfolio | How to Build a Portfolio for an AI Career?Apr 18, 2025 am 11:40 AM

Building a Standout AI/ML Portfolio: A Guide for Beginners and Professionals Creating a compelling portfolio is crucial for securing roles in artificial intelligence (AI) and machine learning (ML). This guide provides advice for building a portfolio

What Agentic AI Could Mean For Security OperationsWhat Agentic AI Could Mean For Security OperationsApr 18, 2025 am 11:36 AM

The result? Burnout, inefficiency, and a widening gap between detection and action. None of this should come as a shock to anyone who works in cybersecurity. The promise of agentic AI has emerged as a potential turning point, though. This new class

Google Versus OpenAI: The AI Fight For StudentsGoogle Versus OpenAI: The AI Fight For StudentsApr 18, 2025 am 11:31 AM

Immediate Impact versus Long-Term Partnership? Two weeks ago OpenAI stepped forward with a powerful short-term offer, granting U.S. and Canadian college students free access to ChatGPT Plus through the end of May 2025. This tool includes GPT‑4o, an a

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SecLists

SecLists

SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

PhpStorm Mac version

PhpStorm Mac version

The latest (2018.2.1) professional PHP integrated development tool

Atom editor mac version download

Atom editor mac version download

The most popular open source editor

ZendStudio 13.5.1 Mac

ZendStudio 13.5.1 Mac

Powerful PHP integrated development environment