Alibaba Cloud releases Tongyi Qianwen 2.0, which surpasses GPT-3.5 in performance and is closing in on GPT-4

WBOY

Oct 31, 2023, 06:05 PM

Alibaba Cloud, Tongyi Qianwen

On October 31, Alibaba Cloud officially released Tongyi Qianwen 2.0, a large model with hundreds of billions of parameters. Across 10 authoritative evaluations, the overall performance of Tongyi Qianwen 2.0 exceeded that of GPT-3.5 and is rapidly closing in on GPT-4. On the same day, the Tongyi Qianwen app was officially launched in major mobile app stores, so anyone can try the latest model capabilities directly through the app.

Over the past six months, Tongyi Qianwen has made a huge leap in performance. Compared with version 1.0, released in April, Tongyi Qianwen 2.0 is significantly better at understanding complex instructions, literary creation, general mathematics, knowledge retention, and resisting hallucinations. At present, the overall performance of Tongyi Qianwen exceeds that of GPT-3.5 and is rapidly catching up with GPT-4.


Picture: The overall performance of Tongyi Qianwen 2.0 has exceeded GPT-3.5 and is rapidly catching up with GPT-4

On 10 mainstream benchmark evaluation sets, including MMLU, C-Eval, GSM8K, HumanEval, and MATH, Tongyi Qianwen 2.0's overall score surpassed that of Meta's Llama-2-70B. Against OpenAI's GPT-3.5 it recorded nine wins and one loss, and against GPT-4 it recorded four wins and six losses, further narrowing the gap with GPT-4.

The ability to understand Chinese and English is a basic skill for a large language model.

On English tasks, Tongyi Qianwen 2.0 scored 82.5 on the MMLU benchmark, second only to GPT-4. With a substantially larger parameter count, Tongyi Qianwen 2.0 can better understand and process complex language structures and concepts. On Chinese tasks, Tongyi Qianwen 2.0 achieved the highest score on the C-Eval benchmark by a clear margin. This is because the model learned from more Chinese corpus data during training, which further strengthened its Chinese understanding and expression capabilities.

Tongyi Qianwen 2.0 has also made significant progress in areas such as mathematical reasoning and code understanding. On the reasoning benchmark GSM8K, Tongyi Qianwen ranked second, demonstrating strong calculation and logical reasoning capabilities. On the HumanEval test, Tongyi Qianwen's score followed closely behind those of GPT-4 and GPT-3.5. HumanEval mainly measures a large model's ability to understand and execute code snippets, which is the basis for using large models in scenarios such as programming assistance and automatic code repair.


Picture: Tongyi Qianwen 2.0 release

Tongyi Qianwen is also more mature and easier to use. Tongyi Qianwen 2.0 has been technically optimized for instruction following, tool use, and fine-grained creative writing, so it can be better integrated into downstream application scenarios. The Tongyi large model official website has launched multimodal and plug-in functions, supporting specialized tasks such as image input and document parsing.

At the same time, eight industry models trained on the Tongyi large model were launched. They include Tongyi Lingma (intelligent coding assistant), Tongyi Zhiwen (AI reading assistant), Tongyi Tingwu (AI assistant for work and study), Tongyi Xiaomi (intelligent customer service), Tongyi Renxin (personal health assistant), and Tongyi Farui (AI legal advisor). The eight industry models target the most popular vertical scenarios and are specially trained on domain data. Users can try the models directly on the official website, and developers can integrate model capabilities into their own large model applications and services through web page embedding, API/SDK calls, and other methods.

Picture: The Tongyi large model family has been fully upgraded, and eight industry models are online


As of October, Alibaba Cloud has established in-depth cooperation with more than 60 industry leaders to promote the adoption of Tongyi Qianwen in office work, cultural tourism, electric power, government affairs, medical insurance, transportation, manufacturing, finance, software development, and other fields.

Zhou Jingren revealed that Alibaba Cloud plans to open-source the 72B version of Tongyi Qianwen in the near future. Alibaba Cloud has previously open-sourced the 7B and 14B models, which have been downloaded more than 1 million times in total. Alibaba Cloud will continue to support developers across industries in building models and applications on top of the open-source Tongyi Qianwen models.


Picture: Tongyi Qianwen 72B will be open-sourced soon


Statement

This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.
