# Quantity is power! Tencent reveals: The greater the number of agents, the better the effect of the large language model

Tencent’s research team conducted a study on the scalability of agents. They found that through simple sampling voting, the performance of large language models (LLMs) increases with the number of instantiated agents. This study has verified the universality of this phenomenon in various scenarios for the first time, compared it with other complex methods, explored the reasons behind this phenomenon, and proposed methods to further exert the scaling effect.


  • Paper title: More Agents Is All You Need

  • Paper address: https://arxiv.org/abs/2402.05120

  • Code address: https://github.com/MoreAgentsIsAllYouNeed/More-Agents-Is-All-You-Need

In this article, researchers from Tencent found that, through a simple sampling-and-voting method, the performance of large language models (LLMs) increases as the number of instantiated agents grows, exhibiting a scaling property (scalability), without the support of complex multi-LLM-agent collaboration frameworks or prompt engineering methods. Furthermore, this method is orthogonal to existing sophisticated methods and, when combined with them, can further enhance LLMs to a degree correlated with task difficulty. This paper is the first study of the scaling property of raw agents (LLM agents that do not rely on complex prompt engineering or collaboration frameworks). It conducts comprehensive experiments on various LLM benchmarks to verify the universality of this finding and examines strategies that can facilitate it. The code is now open source.
## Multiple small models exceed a large model

The paper discusses a variety of LLM-ensemble-related research in detail, including LLM self-ensembles, heterogeneous LLM ensembles, and multi-LLM-agent collaboration frameworks. Comparison with these lines of work shows that the paper's research and analysis are more comprehensive.
To study how the performance of large language models improves as the number of instantiated agents increases, the paper uses a simple sampling-and-voting method (the authors use the term "simple(st)", indicating that they consider it perhaps one of the simplest possible methods). Notably, this method can be orthogonally combined with existing complex methods. It consists of two stages:

  • Input the task query into a single LLM, or into a multi-LLM-agent collaboration framework, to generate multiple outputs;
  • Determine the final result by majority voting.
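The two stages above can be sketched in a few lines of Python. The `generate` callable stands in for one LLM call (or one full pass through an agent-collaboration framework); the function names and the canned mock answers are illustrative assumptions, not the paper's actual code.

```python
from collections import Counter

def sample_and_vote(query, generate, n_agents=10):
    # Stage 1: sample n_agents independent outputs for the same query.
    answers = [generate(query) for _ in range(n_agents)]
    # Stage 2: majority voting over the sampled outputs.
    winner, _ = Counter(answers).most_common(1)[0]
    return winner

# Deterministic mock LLM: returns canned answers in order,
# so the majority answer ("42") wins 3 votes to 1 and 1.
canned = iter(["42", "41", "42", "42", "43"])
print(sample_and_vote("What is 6*7?", lambda q: next(canned), n_agents=5))  # prints 42
```

With a real model, `generate` would sample with a nonzero temperature so the outputs differ across agents; with deterministic decoding, all samples would be identical and voting would add nothing.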
The paper selects language models of different scales from the Llama2 and GPT series and evaluates them on task datasets covering multiple domains such as reasoning and generation. Experimental results show that across all tasks and LLMs of different types and sizes, LLM performance increases with the number of instantiated agents.


For example, the improvement is 12% to 24% on the GSM8K task and 6% to 10% on the MATH task. Interestingly, ensembles of multiple small LLMs can match or even exceed the performance of larger LLMs.
For example, an ensemble of multiple Llama2-13Bs achieved 59% accuracy on GSM8K, exceeding the 54% accuracy of a single Llama2-70B.


### Compatibility with other methods

Further, the authors also explored the method's compatibility with other methods. Although those methods are implemented differently, combining them with sampling-and-voting further improves performance, and the results remain consistent with the phenomenon that the more agents are instantiated, the stronger the gain. The experimental results show gains ranging from 1% to 27%, indicating that this simple method can further enhance LLM performance when used orthogonally with other methods.

### Based on Llama2-13B


### Based on Llama2-70B


### Based on GPT-3.5-Turbo

In addition, the paper also analyzes the relationship between performance improvement and problem difficulty.

  • Intrinsic difficulty: As the inherent difficulty of the task increases, the performance improvement (i.e., relative performance gain) also increases, but once the difficulty reaches a certain level, the gain gradually decreases. This shows that when a task is too complex, the model's reasoning ability may not keep up, so the marginal performance gain diminishes.
  • Number of steps: As the number of steps required to solve a task increases, so does the performance gain. This shows that in multi-step tasks, increasing the number of agents helps the model handle each step better, improving overall task-solving performance.
  • Prior probability: The higher the prior probability of the correct answer, the greater the performance improvement. This means that increasing the number of agents is more likely to yield significant gains when the correct answer is already relatively likely.
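The prior-probability observation matches the classic behavior of majority voting over independent voters (a Condorcet-style, binary-outcome simplification used here for intuition, not the paper's exact analysis): when each agent is correct with probability p above one half, accuracy rises as agents are added, and when p is below one half it falls.

```python
from math import comb

def majority_correct_prob(p, n):
    """Probability that majority voting over n independent agents,
    each correct with probability p, yields the correct answer.
    Odd n avoids ties; wrong answers are treated as a single class."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# A favorable prior improves with more agents; an unfavorable one degrades.
print(majority_correct_prob(0.6, 1))   # 0.6 with a single agent
print(majority_correct_prob(0.6, 25))  # noticeably higher than 0.6
print(majority_correct_prob(0.4, 25))  # noticeably lower than 0.4
```

In practice wrong answers are spread over many distinct strings rather than one class, which makes voting even more favorable than this simplification suggests.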
Nodes: steps; dashed lines: possible alternative steps. Depth of nodes: number of steps; intensity of colors: level of inherent difficulty. The illustration helps the reader understand how task complexity is measured along these dimensions.

Based on this, the paper proposes two optimization strategies to further improve the effectiveness of the method:

  • Step-wise Sampling-and-Voting: This method breaks the task into multiple steps and applies sampling and voting at each step, reducing accumulated errors and improving overall performance.
  • Hierarchical Sampling-and-Voting: This method decomposes low-probability tasks into multiple high-probability subtasks and solves them hierarchically. Different models can be used to handle subtasks with different probabilities, reducing costs.
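A rough sketch of the step-wise variant, with hypothetical helper names (the paper's actual decomposition and prompting are not reproduced here): voting at each step keeps one agreed-upon intermediate result, so a single bad sample cannot derail the rest of the chain.

```python
from collections import Counter

def step_wise_sample_and_vote(steps, generate_step, n_agents=5):
    # Solve the task step by step; at each step, sample n_agents
    # candidate results given the voted context so far, then keep
    # the majority answer before moving on.
    context = []
    for step in steps:
        candidates = [generate_step(step, context) for _ in range(n_agents)]
        best, _ = Counter(candidates).most_common(1)[0]
        context.append(best)
    return context[-1]  # the final step's voted result

# Deterministic mock: each step's samples mostly agree, so voting
# recovers "x=3" for step 1 and then "y=9" for step 2.
script = {"step1": ["x=3", "x=3", "x=2", "x=3", "x=3"],
          "step2": ["y=9", "y=9", "y=9", "y=8", "y=9"]}
print(step_wise_sample_and_vote(["step1", "step2"],
                                lambda s, ctx: script[s].pop(0)))  # prints y=9
```

The hierarchical variant would additionally route easy subtasks to a cheaper model and hard ones to a stronger model, trading a little orchestration logic for lower sampling cost.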

Finally, future work directions are proposed, including optimizing the sampling stage to reduce costs and continuing to develop mechanisms to mitigate the potential negative impacts of LLM hallucinations, ensuring that the deployment of these powerful models is both responsible and beneficial.


Statement: This article is reproduced from Jiqizhixin (机器之心).
