With hundreds of billions of parameters, Alibaba Cloud's Tongyi Qianwen evolves to 2.0: performance exceeds GPT-3.5 and is closing in on GPT-4


王林

Oct 31, 2023, 06:17 PM


On October 31, Alibaba Cloud officially released Tongyi Qianwen 2.0, a large model with hundreds of billions of parameters. Across 10 authoritative benchmark evaluations, the overall performance of Tongyi Qianwen 2.0 exceeds that of GPT-3.5 and is quickly closing the gap with GPT-4. On the same day, the Tongyi Qianwen app launched in major mobile app stores, so anyone can try the latest model's capabilities directly through the app.

Over the past six months, Tongyi Qianwen has made a huge leap in performance. Compared with version 1.0, released in April, Tongyi Qianwen 2.0 is significantly better at complex instruction understanding, literary creation, general mathematics, knowledge retention, and resistance to hallucination. Its overall performance now exceeds that of GPT-3.5 and is closing in on GPT-4.


The comprehensive performance of Tongyi Qianwen 2.0 exceeds GPT-3.5 and is accelerating to catch up with GPT-4

Across 10 mainstream benchmark evaluation sets, including MMLU, C-Eval, GSM8K, HumanEval, and MATH, Tongyi Qianwen 2.0 outscored Meta's Llama-2-70B overall. Against OpenAI's GPT-3.5 it recorded nine wins and one loss; against GPT-4 it recorded four wins and six losses, further narrowing the gap with GPT-4.

Chinese and English comprehension is a basic skill for large language models. On English tasks, Tongyi Qianwen 2.0 scored 82.5 on the MMLU benchmark, second only to GPT-4; its substantially larger parameter count lets it better understand and process complex language structures and concepts. On Chinese tasks, Tongyi Qianwen 2.0 achieved the highest score on the C-Eval benchmark by a clear margin, because the model learned from more Chinese corpus data during training, which further strengthened its Chinese comprehension and expression.

Tongyi Qianwen 2.0 has also made significant progress in areas such as mathematical reasoning and code understanding. On the reasoning benchmark GSM8K, Tongyi Qianwen ranked second, demonstrating strong calculation and logical reasoning capabilities. On HumanEval, its score followed closely behind those of GPT-4 and GPT-3.5. HumanEval mainly measures a large model's ability to understand and execute code fragments, which is the foundation for using large models in scenarios such as programming assistance and automatic code repair.


Tongyi Qianwen 2.0 released

Tongyi Qianwen is now more mature and easier to use. Version 2.0 has been technically optimized for instruction following, tool use, and fine-grained content creation, so that it integrates better into downstream application scenarios. The official Tongyi large model website has launched multimodal and plug-in features, supporting specialized tasks such as image input and document parsing.

At the same time, eight industry models trained on the Tongyi large model were launched: Tongyi Lingma, an intelligent coding assistant; Tongyi Zhiwen, an AI reading assistant; Tongyi Tingwu, an AI assistant for work and study; Tongyi Xingchen, a personalized character creation platform; Tongyi Dianjin, an intelligent investment research assistant; Tongyi Xiaomi, an intelligent customer service assistant; Tongyi Renxin, a personal health assistant; and Tongyi Farui, an AI legal consultant. The eight industry models target the most popular vertical scenarios and are specially trained on domain data. Users can try the models directly on the official website, and developers can integrate model capabilities into their own large model applications and services through web page embedding, API/SDK calls, and similar methods.


Tongyi large model family has been fully upgraded, and 8 major industry model groups have been launched

As of October, Alibaba Cloud has been working closely with leading partners in more than 60 industries to promote the practical application of Tongyi Qianwen in office work, cultural tourism, electric power, government affairs, medical insurance, transportation, manufacturing, finance, software development, and other fields.

Zhou Jingren said that Alibaba Cloud plans to open-source the 72B version of Tongyi Qianwen in the near future. Alibaba Cloud has already open-sourced the 7B and 14B versions of the model, and their cumulative downloads have exceeded 1 million. Alibaba Cloud will continue to support developers in all industries in using the open-source Tongyi Qianwen models to build new models and applications.


Tongyi Qianwen 72B will be open-sourced soon


Statement

This article is reproduced from 机器之心. If there is any infringement, please contact admin@php.cn to have it removed.
