Home  >  Article  >  Technology peripherals  >  The just exposed Claude3 directly attacks the biggest weakness of OpenAI

The just exposed Claude3 directly attacks the biggest weakness of OpenAI

PHPz
PHPzforward
2024-03-10 08:07:251022browse

Enterprise-level SOTA large model, what signals does Anthropic's Claude3 release?

Author | Wan Chen

Editor | Jingyu

is an entrepreneurial project as the head of OpenAI GPT3 R&D, Anthropic is seen as the startup that can best compete with OpenAI.

Anthropic released a set of large Claude 3 series models on Monday local time, claiming that its most powerful model outperformed OpenAI’s GPT-4 and Google’s Gemini 1.0 Ultra in various benchmark tests. .

However, the ability to handle more complex reasoning tasks, be more intelligent, and respond faster, these comprehensive capabilities that rank among the top three large models are only the basic skills of Claude3.

Anthropic is committed to becoming the best partner for corporate customers.

This is first reflected in Claude3, which is a set of models: Haiku, Sonnet and Opus, allowing enterprise customers to choose versions with different performance and different costs according to their own scenarios.

Secondly, Anthropic emphasizes that its own model is the safest. Anthropic President Daniela Amodei introduced that a technology called "Constitutional Artificial Intelligence" was introduced in Claude3's training to enhance its safety, trustworthiness, and reliability. Fu Yao, a doctoral student in large models and reasoning at the University of Edinburgh, pointed out after reading Claude3’s technical report that Claude3 performed well in benchmark tests of complex reasoning, especially in the financial and medical fields. As a ToB company, Anthropic chooses to focus on optimizing the areas with the most profit potential.

Now, Anthropic is open to use two models of the Claude3 series (Haiku and Sonnet) in 159 countries, and the most powerful version, Opus, is also about to be launched. At the same time, Anthropic also provides services through the cloud platforms of Amazon and Google, the latter of which invested US$4 billion and US$2 billion respectively in Anthropic.

Co-founders Dario Amodei and Daniela Amodei said that the release of Claude 3 once again shows that "Anthropic is more of an enterprise company than a company. Consumer company."|Image source: Anthropic刚刚曝光的 Claude3,直击 OpenAI 最大弱点
01,
Smarter, more responsive

Claude3 Family: Opus , Sonnet and HaikuAccording to Anthropic’s official website, Claude3 is a series of models that includes three state-of-the-art models: Claude 3 Haiku, Claude 3 Sonnet and Claude 3 Opus, allowing users to choose for their specific applications The best balance of intelligence, speed and cost. In terms of the general capabilities of the model, Anthropic said that the Claude 3 series "sets a new industry benchmark for a wide range of cognitive tasks" in analysis and prediction, detailed content generation, code generation, and Spanish, Japanese In terms of conversations with non-English languages ​​such as French, it has a stronger ability and is more timely in task response.

Among them, Claude 3 Opus is the most intelligent model in this group of models, especially in processing highly complex tasks. Opus outperforms its peers in most common benchmarks, including Undergraduate Level Expert Knowledge (MMLU), Graduate Level Expert Reasoning (GPQA), Basic Mathematics (GSM8K), and more. It shows near-human-level understanding and fluency on complex tasks. It is currently Anthropic's most cutting-edge exploration of general intelligence, "demonstrating the outer limits of generative artificial intelligence."

Claude3 Model Family|Image Source: Anthropic刚刚曝光的 Claude3,直击 OpenAI 最大弱点
Claude 3 Sonnet achieves the ideal balance between intelligence level and responsiveness Balance, especially for tasks in enterprise scenarios. It delivers powerful performance at a lower cost than similar products and is designed for high endurance in large-scale AI deployments. For the vast majority of workloads, Sonnet is 2x faster and more intelligent than Claude 2 and Claude 2.1. It excels at tasks that require fast responses, such as knowledge retrieval or sales automation.
The Claude 3 Haiku is the most compact model and also the most cost-effective. Moreover, its response speed is also very fast, and it can read information containing charts, graphs, and data-intensive research papers (about 10k tokens) on arXiv in less than three seconds.

02,

Iteration targeting enterprise customers

Co-founder Daniela Amodei introduced that in addition to the advancement of general intelligence, Anthropic pays special attention to enterprise customers. There are many challenges faced when integrating generative AI into their business. Aimed at enterprise customers, the Claude3 family offers improvements in visual capabilities, accuracy, long text input, and security. Many corporate customers have knowledge bases in multiple formats, including PDF, flowcharts or presentation slides. Claude 3 Series models can now handle content in a variety of visual formats, including photos, charts, graphs and technical diagrams. Claude3 has also been optimized for accuracy and capabilities with long text windows.

In terms of accuracy, Anthropic uses a large number of complex factual questions to target known weaknesses in current models, classifying answers into correct answers, incorrect answers (or hallucinations) and acknowledging uncertainty. Accordingly, the Claude3 model indicates that it does not know the answer, rather than providing incorrect information. The strongest version of them all, Claude 3 Opus, doubled the accuracy (or correct answers) on challenging open-ended questions than Claude 2.1, while also reducing the level of incorrect answers.

刚刚曝光的 Claude3,直击 OpenAI 最大弱点
Compared with the Claude2.1 version, the Claude3 series has comprehensively improved response accuracy. |Image source: Anthropic

At the same time, due to the improvement in context understanding capabilities, the Claude3 family will make fewer rejections in response to user tasks compared to previous versions.

In addition to a more accurate reply, Anthropic said it will bring to Claude 3 "citation" feature that can point to precise sentences in reference materials to verify their answers.

Currently, Claude 3 series models will provide a context window of 200K tokens. Subsequently, all three models will be able to accept inputs of more than 1 million tokens, and this capability will be provided to select customers who require enhanced processing capabilities. Anthropic briefly elaborated on Claude3’s upper text window capabilities in its technical report, including its ability to effectively handle longer contextual cue words and its recall capabilities.

03, "Constitutional Artificial Intelligence", Dealing with "Inexact Science"

It is worth noting that, As a multi-modal model, Claude3 can input images but cannot output image content. Co-founder Daniela Amodei said this is because "we found that businesses have much less need for images."

The release of Claude3 was released after the controversy caused by the images generated by Google Gemini. Claude, which is aimed at enterprise customers, is also bound to control and balance issues such as value bias caused by AI.

In this regard, Dario Amodei emphasized the difficulty of controlling artificial intelligence models, calling it "inexact science." He said the company has a dedicated team dedicated to assessing and mitigating the various risks posed by the model.

Another co-founder, Daniela Amodei, also admitted that completely unbiased AI may not be possible with current methods. "Creating a completely neutral generative AI tool is nearly impossible, not only technically, but also because not everyone agrees on what neutrality is," she said. .

刚刚曝光的 Claude3,直击 OpenAI 最大弱点 Previously, Anthropic announced the "Constitutional Artificial Intelligence" used to align large models | Image source: Anthropic
Despite this , Anthropic uses a method called "Constitutional Artificial Intelligence" to make the model consistent with the broad range of human values ​​as much as possible. The model follows the principles defined in the "Constitution" to adjust and optimize.
As the core human developers of OpenAI, the departure of the Amodei brothers and sisters is similar to Musk’s recent complaint against OpenAI, believing that OpenAI is no longer a non-profit organization and no longer follows its original mission to benefit mankind. . A reporter asked Amodei, does Anthropic fit your vision of starting a business abroad?

Amodei said: "Being at the forefront of the development of artificial intelligence is the most effective way to guide the development trajectory of artificial intelligence and bring positive results to society."

This article comes from the WeChat public account: Geek Park (ID: geekpark), author: Wan Chen

The above is the detailed content of The just exposed Claude3 directly attacks the biggest weakness of OpenAI. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:ithome.com. If there is any infringement, please contact admin@php.cn delete