
Comprehensively surpassing AutoGPT, Wall-Facing Intelligence cooperates with Tsinghua NLP Laboratory’s open source large model “Superhero” XAgent

PHPz
2023-10-17 14:29:05

Recently, Wall-Facing Intelligence, a leading Chinese large model company, made another big move: together with the Tsinghua University NLP Laboratory, it jointly developed and released XAgent, the large model "superhero".

In task tests, XAgent's ability to handle real complex tasks comprehensively surpassed that of AutoGPT.



XAgent completely surpasses AutoGPT in real complex task processing

  • XAgent is now officially open source on GitHub: https://github.com/OpenBMB/XAgent
  • Case display address: https://x-agent.net/
  • Blog address: https://blog.x-agent.net

What kind of "person" is XAgent?


AI agents, with an LLM at their core, can understand human instructions, formulate complex plans, and take actions autonomously.

Traditional agents are usually restricted by human-customized rules and can only solve problems within a limited range. They are more like "tools" for human use than true "autonomous agents", and they struggle to solve complex problems on their own.

In contrast, XAgent is endowed with autonomous planning and decision-making capabilities, allowing it to operate independently and discover new strategies and solutions without being bound by human presets.

Its capabilities comprehensively surpass those of AutoGPT. It shows strong autonomy and complex task solving ability across many task scenarios, raising the intelligence of AI agents to a new level.

So the question is: how is it implemented?

"Left and right brain" collaboration, dual-loop mechanism

Just as humans have a "left brain" and a "right brain", complex tasks are usually approached from two perspectives, "macro" and "micro": the overall situation has to be coordinated and planned, and the work also has to be considered at the execution level.


Unlike AutoGPT, XAgent's design introduces a "dual-loop mechanism", an innovation by Wall-Facing Intelligence and Tsinghua University:

  • Outer loop: Responsible for global task planning, decomposing complex tasks into simple, actionable tasks.
  • Inner loop: Responsible for local task execution, focusing on details.

Through the cooperation of the two loops, XAgent is like a "superhero" in the field of large models, showing strong professionalism and rich skills when dealing with different aspects of complex tasks.

Just like "Captain America" in the Marvel universe, XAgent has both overall leadership and meticulous execution.

In the outer loop, XAgent shows leadership as a "PlanAgent". It splits complex tasks into several simple tasks and oversees the complete problem-solving process.

First, it decomposes a given complex task into smaller, more manageable "subtasks", generates an "initial plan", and forms a task sequence.

Subsequently, it passes each subtask to the inner loop for resolution. During this process, the outer loop continuously monitors the progress and status of the task and performs "iterative optimization" of subsequent plans based on feedback.

In the inner loop, XAgent quickly changes its identity, showing its professionalism as an efficient "executor" (ToolAgent) and ensuring that the subtask passed in by the outer loop meets expectations.

Depending on the nature of the subtask, it can retrieve tools from external systems and solve the subtask step by step.

After the subtask is completed, it generates a reflection on the execution of the current subtask and feeds it back to the outer loop, indicating whether the current task is completed and noting potential optimization points in task execution.
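The interplay of the two loops described above can be sketched in a few lines of Python. This is only an illustrative sketch, not XAgent's actual code: the names `Plan`, `Reflection`, `run_dual_loop`, and the three callbacks are hypothetical, and the planner, executor, and refiner are stubs standing in for LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Reflection:
    """Feedback the inner loop returns to the outer loop (hypothetical shape)."""
    completed: bool
    notes: str = ""


@dataclass
class Plan:
    """Ordered queue of subtasks maintained by the outer loop."""
    subtasks: list[str] = field(default_factory=list)


def run_dual_loop(
    task: str,
    make_plan: Callable[[str], Plan],
    execute: Callable[[str], Reflection],
    refine: Callable[[Plan, str, Reflection], Plan],
    max_steps: int = 20,
) -> list[tuple[str, Reflection]]:
    """Outer loop: plan, hand subtasks to the inner loop, refine from feedback."""
    plan = make_plan(task)                        # initial planning
    history: list[tuple[str, Reflection]] = []
    for _ in range(max_steps):                    # hard cap avoids endless loops
        if not plan.subtasks:
            break
        subtask = plan.subtasks.pop(0)
        reflection = execute(subtask)             # inner loop solves one subtask
        history.append((subtask, reflection))
        plan = refine(plan, subtask, reflection)  # iterative optimization
    return history


# Stub callbacks for demonstration only; a real agent would call an LLM here.
def demo_make_plan(task: str) -> Plan:
    return Plan(["inspect data", "check environment", "analyze", "write report"])


def demo_execute(subtask: str) -> Reflection:
    if subtask == "check environment":
        return Reflection(False, "seaborn missing")
    return Reflection(True)


def demo_refine(plan: Plan, subtask: str, reflection: Reflection) -> Plan:
    # If a subtask failed, schedule a repair step before continuing.
    if not reflection.completed:
        plan.subtasks.insert(0, f"fix: {reflection.notes}")
    return plan


history = run_dual_loop("analyze iris.zip", demo_make_plan, demo_execute, demo_refine)
for name, result in history:
    print(name, "->", "done" if result.completed else f"failed ({result.notes})")
```

In this toy run, the failed environment check causes the outer loop to insert a repair subtask before the analysis step, which is the kind of plan revision the feedback channel is meant to enable.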

As shown in the figure, the user submits the iris.zip file and asks XAgent to analyze the data.


As you can see, XAgent first decomposes this task into 4 subtasks through the outer loop:

  1. Check and understand the data;
  2. Check the Python environment of the system to see if the relevant data analysis library exists;
  3. Write data analysis code to process and analyze data;
  4. Write an analysis report based on the Python code execution results.

Subsequently, when executing each subtask, XAgent skillfully uses file reading and writing, shell commands, the Python notebook, and data analysis libraries such as pandas, scikit-learn, seaborn, and matplotlib, and can even perform visual analysis of the data.


AutoGPT, by contrast, made no plan to check the Python environment and the relevant libraries when performing the same task. Instead, it started writing and executing code directly, which led to failures and errors when it used those libraries. In the end, it did not complete the analysis of the data.
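The environment check in subtask 2 is the kind of step that a few lines of standard-library Python can cover. The sketch below is an assumption about what such a check might look like, not the code XAgent generated; the helper name `missing_packages` and the `REQUIRED` list are hypothetical.

```python
import importlib.util


def missing_packages(required: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in required if importlib.util.find_spec(name) is None]


# Import names, not PyPI names: scikit-learn is imported as "sklearn".
REQUIRED = ["pandas", "sklearn", "seaborn", "matplotlib"]

missing = missing_packages(REQUIRED)
if missing:
    print("Install before running the analysis:", ", ".join(missing))
else:
    print("All required analysis libraries are available.")
```

Running a check like this before writing analysis code turns a late import error into an early, actionable message.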

Human-computer collaboration: a new paradigm of agent interaction

Although AutoGPT breaks through the limitations of the traditional GPT model to a certain extent, it still suffers from execution errors such as infinite loops and incorrect calls, which require manual intervention to resolve.

XAgent considered these issues from the start of its design and introduced an interaction mechanism specifically designed to enhance human-machine collaboration: it can interact with users autonomously and ask humans to intervene and provide guidance.

For an intelligent agent, "whether it can cooperate with humans" is also an important indicator of its intelligence.

First, XAgent has an intuitive interface that allows users to directly override or modify the suggestions it makes, effectively combining AI efficiency with human intuition and expertise.

Second, when faced with unfamiliar challenges, XAgent has the ability to "ask humans for help". It solicits real-time feedback, suggestions, or guidance from users to ensure that the agent can perform at its best even in uncertain areas.


This interaction paradigm organically integrates the autonomy of AI with human wisdom and demonstrates a new collaborative relationship between people and XAgent.

As shown in the picture, the user wants XAgent to help recommend some delicious restaurants to party with friends, but no specific and detailed information is provided.

At this point, XAgent recognizes that the information the user has provided is not enough to make recommendations, so it asks the user for their preferred location, budget range, taste preferences, and any dietary taboos, and provides restaurant recommendations after receiving the user's feedback.

AutoGPT, on the other hand, directly started searching for restaurant information on the Internet for recommendations. The final recommended results were in the wrong location, did not consider the user's budget, and did not meet the user's needs.
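The difference between the two behaviors comes down to checking for missing information before acting. A minimal sketch of that idea follows; the field names and the `collect_requirements` helper are hypothetical and are not taken from XAgent's AskHumanforHelp tool.

```python
from typing import Callable

# Details the agent wants before recommending a restaurant (illustrative only).
REQUIRED_FIELDS = ("location", "budget", "cuisine", "dietary_restrictions")


def collect_requirements(
    known: dict[str, str],
    ask_human: Callable[[str], str],
) -> dict[str, str]:
    """Fill in missing details by asking the user instead of guessing."""
    filled = dict(known)
    for name in REQUIRED_FIELDS:
        if not filled.get(name):
            question = f"Please tell me your {name.replace('_', ' ')}: "
            filled[name] = ask_human(question)
    return filled
```

In an interactive session, the built-in `input` function can be passed as `ask_human`; in tests, a scripted function can stand in for the user.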

Efficient communication language, powerful tool calling

Whether in the "dual-loop" operating mechanism or the interactive capability of "human-machine collaboration", the Wall-Facing Intelligence and Tsinghua University teams focused on core properties of the agent such as stability, efficiency, and safety throughout the overall design of XAgent.

A structured communication method is also one of the important factors in building a strong and stable agent.

XAgent uses Function Call as its internal communication language, which has the advantages of being structured, standardized, and unified.

  • Structured: Function Call has a clear and rigorous format that expresses the required content unambiguously, thus minimizing potential errors.
  • Standardized: Function Call standardizes the interaction with external tools and provides a common language, so that the agent can use and integrate multiple tools to solve complex tasks.
  • Unified: Converting every stage, such as information summary, task planning, and tool execution, into a specific Function Call form ensures that all stages are handled in a unified manner, thus simplifying system design.
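To make the three properties concrete, here is a small sketch of a function-call message being validated and dispatched. It is a simplified illustration under stated assumptions: the JSON shape (`name` plus `arguments`), the `TOOLS` registry, and both stub tools are hypothetical and do not reproduce XAgent's internal schema.

```python
import json


def read_file_stub(path: str) -> str:
    """Stand-in for a real file tool; returns a placeholder string."""
    return f"<contents of {path}>"


def submit_subtask_stub(completed: bool, summary: str) -> str:
    """Stand-in for reporting a subtask result back to the planner."""
    return f"completed={completed}: {summary}"


# Hypothetical registry: function name -> (callable, required argument names).
TOOLS = {
    "read_file": (read_file_stub, ("path",)),
    "submit_subtask": (submit_subtask_stub, ("completed", "summary")),
}


def dispatch(call_json: str) -> str:
    """Parse one function-call message, validate it, and run the matching tool."""
    call = json.loads(call_json)
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise ValueError(f"unknown function: {name}")
    func, required = TOOLS[name]
    missing = [arg for arg in required if arg not in args]
    if missing:
        raise ValueError(f"{name} is missing arguments: {missing}")
    return func(**args)


print(dispatch('{"name": "read_file", "arguments": {"path": "iris.csv"}}'))
```

Because planning results and tool invocations travel through the same `dispatch` path in this sketch, malformed messages are rejected in one place rather than in every tool.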

In addition, tool invocation is one of the important abilities for evaluating whether an AI agent can solve complex problems.

XAgent's design includes an original tool execution engine, ToolServer, which provides safer, more efficient, and more scalable tool execution.

It runs in an isolated Docker environment, ensuring that tool execution does not compromise the stability or security of the main system.

This design brings multiple benefits:

  • Safe: Running tools inside Docker containers protects the main system from potential compromise.
  • Efficient: The system can start, stop, and restart nodes based on demand and usage patterns to make optimal use of resources.
  • Extensible: Code is convenient to manage and easier to debug and extend.

The key components of ToolServer are ToolServerNode, ToolServerMonitor, and ToolServerManager, which provide powerful capabilities for executing operations, inspecting nodes, and managing node lifecycles.
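The idea of starting, stopping, and restarting nodes on demand can be illustrated with a toy model. The sketch below is purely hypothetical: `ToyNode` and `ToyManager` are invented names, no Docker containers are involved, and the idle-timeout policy is an assumption rather than ToolServer's documented behavior.

```python
from dataclasses import dataclass


@dataclass
class ToyNode:
    """Toy stand-in for an isolated execution node."""
    node_id: str
    last_used: float
    running: bool = True


class ToyManager:
    """Creates nodes on demand and stops the ones that have been idle too long."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout
        self.nodes: dict[str, ToyNode] = {}

    def acquire(self, node_id: str, now: float) -> ToyNode:
        """Return a running node, creating or restarting it if necessary."""
        node = self.nodes.get(node_id)
        if node is None:
            node = ToyNode(node_id, now)
            self.nodes[node_id] = node
        node.running = True
        node.last_used = now
        return node

    def sweep(self, now: float) -> list[str]:
        """Monitor pass: stop running nodes idle for longer than the timeout."""
        stopped = []
        for node in self.nodes.values():
            if node.running and now - node.last_used > self.idle_timeout:
                node.running = False
                stopped.append(node.node_id)
        return stopped
```

Timestamps are passed in explicitly so the policy can be exercised without real clocks or containers; a real system would replace the `running` flag with actual container start and stop operations.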

Currently, XAgent's ToolServer supports FileSystemEnv, PythonNotebook, WebEnv, ExecuteShell, RapidAPIEnv, AskHumanforHelp, and other tools.

XAgent can not only help us do some simple tasks, it can even help us train models.

For example, suppose a user wants to analyze movie reviews and determine whether the public's evaluation of a movie is good or bad. In this case, XAgent first downloads the IMDB dataset, trains a BERT model on it, and then uses the trained BERT model to classify the movie reviews.


Unleashing the potential of large models, comprehensively surpassing AutoGPT

Tests on a series of tasks show (see Figures a and b below) that XAgent based on GPT-4 outperformed the original GPT-4 on every benchmark and comprehensively surpassed AutoGPT.

These tasks require the agent to reason, plan, and use external tools. They include answering questions with a search engine (FreshQA, HotpotQA), Python programming (MBPP), mathematical reasoning (MATH), interactive programming (InterCode), embodied reasoning (ALFWorld), and real complex tasks.

Figure a: XAgent comprehensively surpasses AutoGPT in real complex task processing


Figure b: XAgent surpasses AutoGPT in six benchmark tests

It can be seen that XAgent's system design fully releases the underlying capabilities of GPT-4 and achieves very high test results and human preference.

This shows not only that XAgent performs well in traditional AI tests that require reasoning and planning, but also that it performs better when processing complex real-world instructions.

Expand the application boundary and solidify the technical foundation

The emergence of AI agents has shown the entire industry an important direction for putting large model technology into practice: an entire workflow of tasks can be executed without complex prompt exploration.

As a large model "superhero" with unlimited potential, XAgent can become a "personal assistant" for every ordinary person. It can help us plan our schedule, arrange itineraries, and manage time and resource allocation in life and work.

It can also independently use a variety of data collection, processing and analysis tools to fully automatically analyze massive data and form reports to help users obtain important information efficiently.

In addition, XAgent can combine external tools with autonomous planning algorithms to make decisions based on environmental information to achieve more efficient and accurate task execution.

XAgent’s R&D team is formed by a number of experts and scholars in the field of large models from Wall-Facing Intelligence and Tsinghua University’s THUNLP Laboratory. They are more like the "superheroes" of the large model world.

This innovative achievement could be launched successfully because, over long-term scientific research, the team has built a series of cutting-edge large model infrastructure (Infra), solidifying the technical foundation and expanding the boundaries of innovation and R&D.

Wall-Facing Intelligence has teamed up with the Tsinghua University NLP Laboratory and the OpenBMB open source community to create a "trinity" industry-university-research ecosystem for large models, and has proposed and released multiple frameworks and engines for large model tool use:

  • Tool Learning: a large model tool learning paradigm that integrates the strengths of specialized tools and large models to achieve higher accuracy, efficiency, and autonomy in problem solving.
  • BMTools: a large model learning engine and open source repository that lets language models use extension tools. It is also a platform for the open source community to build and share tools.
  • ToolLLM: a large model tool learning framework that connects large models to 16,000 real APIs, allowing them to complete more complex user instructions by calling external tools.
  • WebCPM: the first Chinese-language model framework that supports Internet search, filling a gap among domestic large models. It lets large models search web pages for answers in real time as humans do, improving the timeliness and accuracy of AIGC.


Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn to have it removed.