


The newly unveiled Claude 3 directly targets OpenAI's biggest weakness
An enterprise-grade SOTA large model: what signals does Anthropic's Claude 3 release send?
Author | Wan Chen
Editor | Jingyu
Founded by the former head of GPT-3 research and development at OpenAI, Anthropic is seen as the startup best positioned to compete with OpenAI.
Anthropic released the Claude 3 family of large models on Monday local time, claiming that its most powerful model outperforms OpenAI's GPT-4 and Google's Gemini 1.0 Ultra on a range of benchmarks.
However, handling more complex reasoning tasks, being more intelligent, and responding faster, the all-round capabilities that place it among the top three large models, are only Claude 3's basic skills.
Anthropic is committed to becoming the best partner for corporate customers.
This is reflected first in the fact that Claude 3 is a family of models: Haiku, Sonnet and Opus. Enterprise customers can choose the version whose performance and cost fit their own scenarios.
Secondly, Anthropic emphasizes that its models are the safest. Anthropic President Daniela Amodei said that a technique called "Constitutional Artificial Intelligence" was used in Claude 3's training to make it safer, more trustworthy, and more reliable.
Fu Yao, a doctoral student working on large models and reasoning at the University of Edinburgh, pointed out after reading the Claude 3 technical report that Claude 3 performs well on complex reasoning benchmarks, especially in the financial and medical fields. As a business-facing (ToB) company, Anthropic has chosen to focus on optimizing the areas with the most profit potential.
Anthropic has now opened two models of the Claude 3 series, Opus and Sonnet, for use in 159 countries, and Haiku is due to launch soon. Anthropic also provides its services through the cloud platforms of Amazon and Google, which have invested US$4 billion and US$2 billion in Anthropic, respectively.

01, The Claude 3 family: Opus, Sonnet and Haiku
According to Anthropic's official website, Claude 3 is a family of three state-of-the-art models: Claude 3 Haiku, Claude 3 Sonnet and Claude 3 Opus. Users can choose the best balance of intelligence, speed and cost for their specific applications. In terms of general capabilities, Anthropic said that the Claude 3 series "sets a new industry benchmark for a wide range of cognitive tasks". The models are stronger at analysis and forecasting, detailed content generation, code generation, and conversation in non-English languages such as Spanish, Japanese and French, and they respond to tasks more promptly.
Among them, Claude 3 Opus is the most intelligent model in the family, especially on highly complex tasks. Opus outperforms its peers on most common benchmarks, including undergraduate-level expert knowledge (MMLU), graduate-level expert reasoning (GPQA), and basic mathematics (GSM8K). It shows near-human levels of understanding and fluency on complex tasks. It is currently Anthropic's most advanced exploration of general intelligence, "demonstrating the outer limits of generative artificial intelligence."

02, Iterating for enterprise customers
Co-founder Daniela Amodei said that in addition to advancing general intelligence, Anthropic pays special attention to the many challenges enterprise customers face when integrating generative AI into their business. For these customers, the Claude 3 family offers improvements in visual capabilities, accuracy, long text input, and security. Many corporate customers keep knowledge bases in multiple formats, including PDFs, flowcharts and presentation slides. Claude 3 series models can now handle content in a variety of visual formats, including photos, charts, graphs and technical diagrams. Claude 3 has also been optimized for accuracy and for long context windows.
In terms of accuracy, Anthropic uses a large set of complex factual questions that target known weaknesses in current models, and classifies each answer as correct, incorrect (a hallucination), or an acknowledgment of uncertainty. In the last case, the Claude 3 model says that it does not know the answer rather than providing incorrect information. The strongest version, Claude 3 Opus, achieved twice the accuracy (share of correct answers) of Claude 2.1 on challenging open-ended questions, while also reducing the share of incorrect answers.
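The three-way grading described above can be illustrated with a minimal sketch. It assumes each answer has already been graded by some external process; the label strings, the `summarize` function, and the sample grades are hypothetical and are not taken from Anthropic's report.

```python
from collections import Counter

# Hypothetical label names; the article names the three categories
# but does not specify a data format.
CORRECT, INCORRECT, UNSURE = "correct", "incorrect", "unsure"


def summarize(labels):
    """Return the share of each answer category in a list of graded answers."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {cat: counts[cat] / total for cat in (CORRECT, INCORRECT, UNSURE)}


# Made-up grades for illustration only, not Anthropic's results.
old_model = [CORRECT] * 3 + [INCORRECT] * 4 + [UNSURE] * 3
new_model = [CORRECT] * 6 + [INCORRECT] * 2 + [UNSURE] * 2

old_summary = summarize(old_model)
new_summary = summarize(new_model)
print(old_summary)
print(new_summary)
```

Reporting the three shares separately shows whether a gain in correct answers came from fewer hallucinations or merely from fewer admissions of uncertainty, which a single accuracy number would hide.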

At the same time, thanks to improved context understanding, the Claude 3 family refuses user tasks less often than previous versions did.
In addition to more accurate replies, Anthropic said it will bring a "citation" feature to Claude 3 that can point to precise sentences in reference materials so that answers can be verified.
Currently, Claude 3 series models provide a context window of 200K tokens. Later, all three models will be able to accept inputs of more than 1 million tokens, and this capability will be offered to select customers who need enhanced processing power. In its technical report, Anthropic briefly describes Claude 3's context window capabilities, including its ability to handle longer prompts effectively and its recall.
03, "Constitutional Artificial Intelligence", Dealing with "Inexact Science"
It is worth noting that, as a multimodal model, Claude 3 can take images as input but cannot output image content. Co-founder Daniela Amodei said this is because "we found that businesses have much less need for images."
Claude 3 was released after the controversy over images generated by Google Gemini. Claude, which is aimed at enterprise customers, also has to control and balance issues such as value bias introduced by AI.
In this regard, Dario Amodei emphasized the difficulty of controlling artificial intelligence models, calling it an "inexact science." He said the company has a dedicated team for assessing and mitigating the various risks posed by the model.
Another co-founder, Daniela Amodei, also admitted that completely unbiased AI may not be possible with current methods. "Creating a completely neutral generative AI tool is nearly impossible, not only technically, but also because not everyone agrees on what neutrality is," she said.

This article comes from the WeChat public account: Geek Park (ID: geekpark), author: Wan Chen