Quantity is power! Tencent reveals: The greater the number of agents, the better the effect of the large language model

Tencent's research team studied how agents scale. They found that with a simple sampling-and-voting method, the performance of large language models (LLMs) increases with the number of instantiated agents. The study is the first to verify the generality of this phenomenon across a range of scenarios. It also compares the method with more complex approaches, explores the reasons behind the phenomenon, and proposes strategies to further strengthen the scaling effect.


  • Paper title: More Agents Is All You Need

  • Paper address: https://arxiv.org/abs/2402.05120

  • Code address: https://github.com/MoreAgentsIsAllYouNeed/More-Agents-Is-All-You-Need

In this paper, researchers from Tencent found that with a simple sampling-and-voting method, the performance of large language models increases as the number of instantiated agents increases, exhibiting a scaling property (scalability), without the support of complex multi-LLM-agent collaboration frameworks or prompt engineering methods. Furthermore, the method is orthogonal to existing sophisticated methods and, when combined with them, can further enhance LLMs to a degree that correlates with task difficulty. The paper presents the first study of the scaling property of raw agents (LLM agents that do not rely on complex prompt engineering or collaboration frameworks). It conducts comprehensive experiments on various LLM benchmarks to verify the generality of this finding and examines strategies that can facilitate its occurrence. The code is now open source.
## Ensembles of smaller models surpass a larger model

The paper discusses in detail a variety of related work on LLM ensembles, including LLM self-ensemble, heterogeneous LLM ensemble, and multi-LLM-agent collaboration frameworks. A comparison with the proposed method shows that the paper carries out a more comprehensive study and analysis.
The paper studies how the performance of large language models improves as the number of instantiated agents increases. It uses a simple sampling-and-voting method (the authors use the term "simple(st)", suggesting they consider it possibly one of the simplest methods available). Notably, this method can be combined orthogonally with existing complex methods. It has two stages:

  • Feed the task query into a single LLM or a multi-LLM-agent collaboration framework to generate multiple outputs;
  • Determine the final result by majority voting.
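
The two stages above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' released code: `sample_and_vote` and `sample_fn` are hypothetical names, `sample_fn` stands in for whatever produces one output per call (a single LLM or a whole agent framework), and voting is assumed to be exact-match counting on the stripped output string.

```python
from collections import Counter
from typing import Callable, List


def sample_and_vote(query: str, sample_fn: Callable[[str], str], num_agents: int) -> str:
    """Stage 1: draw num_agents outputs. Stage 2: return the most frequent one."""
    if num_agents < 1:
        raise ValueError("num_agents must be at least 1")
    # Stage 1: each call to the (hypothetical) sample_fn plays the role of one agent.
    outputs: List[str] = [sample_fn(query) for _ in range(num_agents)]
    # Stage 2: majority vote; stripping whitespace lets trivially different
    # strings count as the same answer (an assumption of this sketch).
    votes = Counter(out.strip() for out in outputs)
    # On a tie, most_common keeps the answer that was sampled first.
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Stand-in for an LLM: replays canned answers instead of calling a model.
    canned = iter(["42", "41", "42", "42", "40"])
    print(sample_and_vote("What is 6 * 7?", lambda _q: next(canned), num_agents=5))
```

Exact-match counting only makes sense for tasks with short, comparable answers. Open-ended outputs such as generated code would need some similarity measure in place of the `Counter`, which this sketch does not attempt.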
The paper evaluates language models of different scales from the Llama2 and GPT series on task datasets covering multiple domains, such as reasoning and generation. The experimental results show that across all tasks, and across LLMs of different types and sizes, performance increases with the number of instantiated agents.


For example, accuracy improves by 12% to 24% on the GSM8K task and by 6% to 10% on the MATH task. Interestingly, ensembles of multiple small LLMs can match or even exceed the performance of larger LLMs.
In one case, an ensemble of multiple Llama2-13B instances achieved 59% accuracy on GSM8K, exceeding the 54% accuracy of a single Llama2-70B.

Further, the authors also explored the method's compatibility with other methods. Although these methods are implemented differently, combining them with sampling-and-voting further improves performance, and the results remain consistent with the observation that the more agents are instantiated, the larger the performance gain. The experimental results show gains ranging from 1% to 27%, indicating that this simple method can further enhance LLM performance when used orthogonally with other methods.

[Figures: results based on Llama2-13B, Llama2-70B, and GPT-3.5-Turbo]

In addition, the paper also analyzes the relationship between performance improvement and problem difficulty.

  • Intrinsic difficulty: As the inherent difficulty of a task increases, the performance improvement (i.e., the relative performance gain) also increases, but once the difficulty reaches a certain level, the gain gradually decreases. This suggests that when a task is too complex, the model's reasoning ability may not keep up, so the marginal benefit of adding agents diminishes.
  • Number of steps: As the number of steps required to solve a task increases, so does the performance gain. This suggests that in multi-step tasks, increasing the number of agents helps the model handle each step better, improving overall task-solving performance.
  • Prior probability: The higher the prior probability of the correct answer, the greater the performance improvement. In other words, adding agents is more likely to yield a significant improvement when the correct answer is more likely to begin with.
In the illustration, nodes represent steps and dashed lines represent possible alternative steps. The depth of a node indicates the number of steps, and the intensity of its color indicates the level of inherent difficulty. The illustration helps the reader understand how task complexity is measured along these dimensions.

Based on this, the paper proposes two optimization strategies to further improve the effectiveness of the method:

  • Step-wise Sampling-and-Voting: This method breaks the task into multiple steps and applies sampling and voting at each step to reduce accumulated errors and improve overall performance.
  • Hierarchical Sampling-and-Voting: This method decomposes low-probability tasks into multiple high-probability subtasks and solves them hierarchically. Different models can also be used to handle subtasks with different probabilities, which reduces costs.
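
As a rough illustration of the step-wise strategy only (the hierarchical one is not covered here), the loop below votes on each step before moving to the next. It is a sketch under stated assumptions rather than the paper's implementation: `stepwise_sample_and_vote`, `sample_step_fn`, `max_steps`, and the `"<done>"` stop marker are all hypothetical, and voting is again exact-match counting.

```python
from collections import Counter
from typing import Callable, List


def stepwise_sample_and_vote(
    query: str,
    sample_step_fn: Callable[[str, List[str]], str],
    num_agents: int,
    max_steps: int,
    stop_token: str = "<done>",  # hypothetical marker for "no further steps"
) -> List[str]:
    """Build a solution one step at a time, voting on every step."""
    steps: List[str] = []
    for _ in range(max_steps):
        # Sample num_agents candidates for the next step, given the agreed steps so far.
        candidates = [sample_step_fn(query, list(steps)) for _ in range(num_agents)]
        # Keep only the majority candidate, so a minority error at this step
        # is not carried into later steps.
        winner = Counter(c.strip() for c in candidates).most_common(1)[0][0]
        if winner == stop_token:
            break
        steps.append(winner)
    return steps


if __name__ == "__main__":
    # Stand-in for an LLM: the next step depends only on how many steps exist.
    script = ["x = 6 * 7", "answer = x", "<done>"]
    print(stepwise_sample_and_vote("toy task", lambda _q, s: script[len(s)], 3, 10))
```

The cost of this design is one round of `num_agents` samples per step, so the total number of model calls grows with the length of the solution.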

Finally, the paper proposes directions for future work, including optimizing the sampling stage to reduce costs and continuing to develop mechanisms that mitigate the potential negative impacts of LLM hallucinations, so that the deployment of these powerful models is both responsible and beneficial.

Statement
This article is reproduced from 机器之心. If there is any infringement, please contact admin@php.cn to have it removed.