


Alibaba Cloud releases Tongyi Qianwen 2.0, which surpasses GPT-3.5 in performance and is accelerating to catch up with GPT-4
On October 31, Alibaba Cloud officially released Tongyi Qianwen 2.0, a large model with hundreds of billions of parameters. Across 10 authoritative evaluations, the overall performance of Tongyi Qianwen 2.0 exceeded that of GPT-3.5, and it is accelerating to catch up with GPT-4. On the same day, the Tongyi Qianwen app was officially launched in major mobile app stores, so anyone can try the latest model capabilities directly through the app.
Over the past six months, Tongyi Qianwen has made a huge leap in performance. Compared with version 1.0, released in April, Tongyi Qianwen 2.0 has improved significantly in understanding complex instructions, literary creation, general mathematics, knowledge retention, and resistance to hallucinations. At present, the overall performance of Tongyi Qianwen has exceeded GPT-3.5 and is accelerating to catch up with GPT-4.
Picture: Tongyi Qianwen 2.0's overall performance has exceeded GPT-3.5 and is accelerating to catch up with GPT-4
On 10 mainstream benchmark evaluation sets, including MMLU, C-Eval, GSM8K, HumanEval, and MATH, Tongyi Qianwen 2.0's overall score surpassed Meta's Llama-2-70B. Against OpenAI's GPT-3.5 it recorded nine wins and one loss, and against GPT-4 it recorded four wins and six losses, further narrowing the gap with GPT-4.
The ability to understand Chinese and English is a basic skill of a large language model.
On English tasks, Tongyi Qianwen 2.0 scored 82.5 on the MMLU benchmark, second only to GPT-4. With a significantly larger number of parameters, Tongyi Qianwen 2.0 can better understand and process complex language structures and concepts. On Chinese tasks, Tongyi Qianwen 2.0 achieved the highest score on the C-Eval benchmark by a clear margin. This is because the model learned from more Chinese corpus data during training, which further strengthened its Chinese understanding and expression capabilities.
In areas such as mathematical reasoning and code understanding, Tongyi Qianwen 2.0 has made significant progress. On the reasoning benchmark GSM8K, Tongyi Qianwen ranked second, demonstrating strong calculation and logical reasoning capabilities. On the HumanEval test, Tongyi Qianwen's score followed closely behind GPT-4 and GPT-3.5. HumanEval mainly measures a large model's ability to understand and execute code snippets, which is the basis for using large models in scenarios such as programming assistance and automatic code repair.
Picture: Tongyi Qianwen 2.0 release
Tongyi Qianwen is more mature and easier to use. Tongyi Qianwen 2.0 has been technically optimized for instruction following, tool use, and fine-grained creative writing, so it can be better integrated into downstream application scenarios. The Tongyi large model official website has launched multi-modal and plug-in functions, supporting specific tasks such as image input and document parsing.
At the same time, eight industry models trained on the Tongyi large model were launched. The models named include Tongyi Lingma (intelligent coding assistant), Tongyi Zhiwen (AI reading assistant), Tongyi Listening (AI assistant for work and study), Tongyi Xiaomi (intelligent customer service), Tongyi Renxin (personal health assistant), and Tongyi Farui (AI legal advisor). The eight industry models target the most popular vertical scenarios and use domain data for specialized training. Users can try the models directly on the official website, and developers can integrate model capabilities into their own large model applications and services through web page embedding, API/SDK calls, and similar methods.
Picture: The Tongyi large model family has been fully upgraded, and eight industry models are online
As of October, Alibaba Cloud had entered into in-depth cooperation with more than 60 industry leaders to promote the adoption of Tongyi Qianwen in office work, cultural tourism, electric power, government affairs, medical insurance, transportation, manufacturing, finance, software development, and other fields.
Zhou Jingren revealed that Alibaba Cloud plans to open source the 72B version of Tongyi Qianwen in the near future. Alibaba Cloud has previously open sourced the 7B and 14B versions, and the cumulative number of ... Alibaba Cloud will continue to support developers across all industries in building models and applications on top of the open source Tongyi Qianwen models.
Picture: Tongyi Qianwen 72B will be open sourced soon
