WizardLM-2, which is "very close to GPT-4", was urgently withdrawn by Microsoft. What's the inside story?

PHPz (forwarded)
2024-04-30 16:40:12

Some time ago, Microsoft scored an own goal: it open-sourced WizardLM-2 with great fanfare, then withdrew it completely soon after.

According to the release information that can still be found, WizardLM-2 is an open-source large model "truly comparable to GPT-4," with improved performance in complex chat, multilingual tasks, reasoning, and agent capabilities.

The series includes three models: WizardLM-2 8x22B, WizardLM-2 70B, and WizardLM-2 7B. Among them:

  • WizardLM-2 8x22B is the most advanced model and, by the team's internal evaluation, the best open-source LLM for highly complex tasks;
  • WizardLM-2 70B has top-tier reasoning capabilities and is the first choice at its scale;
  • WizardLM-2 7B is the fastest, with performance comparable to leading open-source models 10 times its size.

Additionally, based on human preference evaluation, WizardLM-2 8x22B's capabilities are "only slightly behind the GPT-4-1106-preview, but significantly stronger than Command R+ and GPT-4-0314."

Would it become another open-source milestone as popular as Llama 3?

While everyone was busy downloading the model, the team suddenly withdrew everything: the blog, the GitHub repository, and the Hugging Face pages all returned 404.

Picture source: https://wizardlm.github.io/WizardLM2/

The team's explanation was:

Hello to all Hugging Face friends! Sorry, we removed the model. It has been a while since our last release a few months ago, so we were not familiar with the new release process: we accidentally omitted a required step in the model release process, toxicity testing, which all new models are currently required to complete.

We are completing this test as quickly as possible and will re-release our model soon. Don't worry; thank you for your concern and understanding.

However, the AI community's attention to and discussion of WizardLM-2 has not stopped, and several doubts have been raised.

First, the deleted projects are not limited to WizardLM-2: all of the team's Wizard-series work is gone, including the earlier WizardMath and WizardCoder.

Second, some questioned why the blog was deleted along with the model weights. If the release was merely missing a test, there would be no need to withdraw everything.

The team's explanation was: "according to relevant regulations." What specific regulations? No one knows yet.

Third, there was speculation that the team behind WizardLM had been fired and that the withdrawal of the Wizard-series projects was also forced.

However, the team denied this speculation:

Picture source: https://x.com/_Mira___Mira_/status/1783716276944486751

Picture source: https://x.com/DavidFSWD/status/1783682898786152470

And a search for the author's name shows that it has not completely disappeared from Microsoft's official website:

Image source: https://www.microsoft.com/en-us/research/people/qins/

Fourth, some speculated that Microsoft withdrew the open-source model partly because its performance was too close to GPT-4's, and partly because it "collided" with OpenAI's technical route.

What route, specifically? We can look at the technical details from the original blog page.

The team stated that as LLM training proceeds, naturally occurring human-generated data is increasingly being exhausted, and that data carefully created by AI, together with models supervised step by step by AI, will be the only path to more powerful AI.

Over the past year, the Microsoft team has built a synthetic training system fully powered by artificial intelligence, as shown in the figure below.

(Figure: overview of the fully AI-powered synthetic training system)

The system is roughly divided into several modules:

Data preprocessing:

  1. Data analysis: use this pipeline to obtain the distribution of different attributes in new source data, which helps build a preliminary understanding of the data.
  2. Weighted sampling: the optimal distribution of training data is often inconsistent with the natural distribution of human chat corpora, so the weight of each attribute in the training data needs to be adjusted based on experimental experience.
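The weighted-sampling step above can be sketched as follows. This is a minimal illustration only: the attribute names, target shares, and function signature are invented for the example, not the team's actual values or code.

```python
import random
from collections import Counter

random.seed(0)  # deterministic for the example

def weighted_sample(corpus, attr_of, target_dist, k):
    """Draw k items so the attribute distribution of the sample
    approaches target_dist instead of the corpus's natural distribution."""
    natural = Counter(attr_of(x) for x in corpus)
    # Weight each item by (target share of its attribute) / (natural count),
    # so each attribute group's total weight equals its target share.
    weights = [target_dist.get(attr_of(x), 0.0) / natural[attr_of(x)]
               for x in corpus]
    return random.choices(corpus, weights=weights, k=k)

# Toy corpus: chat-heavy natural distribution, rebalanced toward math/code.
corpus = ([{"topic": "chat"}] * 80
          + [{"topic": "math"}] * 15
          + [{"topic": "code"}] * 5)
target = {"chat": 0.4, "math": 0.3, "code": 0.3}
sample = weighted_sample(corpus, lambda x: x["topic"], target, k=1000)
```

The sampled set then reflects the hand-tuned target mix (roughly 40/30/30 here) rather than the corpus's natural 80/15/5 split.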

Evol Lab:

  1. Evol-Instruct: considerable effort went into re-evaluating the problems in the original Evol-Instruct method and making preliminary revisions; the new method lets various agents automatically generate high-quality instructions.
  2. Evol-Answer: guiding the model to generate and rewrite responses multiple times improves their logic, correctness, and friendliness.
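An evolve-then-rewrite loop of this kind might look like the sketch below. The prompt wording and the `llm` callable are placeholders assumed for illustration; they are not the team's actual prompts or implementation.

```python
# Sketch of an Evol-Instruct / Evol-Answer style loop. The prompts and the
# `llm` callable are illustrative placeholders, not the team's code.

EVOLVE_PROMPT = (
    "Rewrite the following instruction to be more complex by adding one "
    "extra constraint, while keeping it answerable:\n{instruction}"
)
REWRITE_PROMPT = (
    "Improve the logic and correctness of this answer.\n"
    "Question: {instruction}\nAnswer: {answer}"
)

def evolve_dataset(seed_instructions, llm, rounds=2):
    data = []
    for inst in seed_instructions:
        for _ in range(rounds):  # Evol-Instruct: deepen the instruction
            inst = llm(EVOLVE_PROMPT.format(instruction=inst))
        answer = llm("Answer this instruction:\n" + inst)
        # Evol-Answer: rewrite the response to improve logic and correctness
        answer = llm(REWRITE_PROMPT.format(instruction=inst, answer=answer))
        data.append({"instruction": inst, "output": answer})
    return data

# Stand-in "LLM" so the sketch runs offline: echoes the last prompt line.
def fake_llm(prompt):
    return prompt.splitlines()[-1] + " [refined]"

dataset = evolve_dataset(["Sort a list in Python."], fake_llm, rounds=1)
```

In real use, `llm` would wrap an actual chat-completion call, and the evolved instruction/response pairs would feed the supervised-learning stage.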

AI Align AI (AAA):

  1. Co-teaching: collect WizardLM and various licensed state-of-the-art models, both open source and proprietary, and let them teach and improve each other through simulated chats, quality critiques, improvement suggestions, and closing skill gaps.
  2. Self-Teaching: WizardLM can generate new evolved training data for supervised learning and preference data for reinforcement learning through active learning from itself.
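One way to picture a co-teaching round is a draft-critique-revise exchange between two models. The function name, message formats, and stand-in models below are assumptions for illustration; the blog does not specify this exact protocol.

```python
# Hypothetical co-teaching round between a "student" and a "teacher" model.
# Both are plain callables here; real use would wrap API or local models.

def co_teach_round(student, teacher, prompt):
    """One round: student drafts, teacher critiques, student revises.
    Returns a preference pair usable as DPO-style training data."""
    draft = student(prompt)
    critique = teacher("Critique this answer and point out flaws:\n" + draft)
    revised = student(prompt + "\nCritique: " + critique + "\nPlease revise.")
    return {"prompt": prompt, "chosen": revised, "rejected": draft}

# Offline stand-ins so the sketch runs without any model.
student = lambda p: "draft answer" if "Critique" not in p else "revised answer"
teacher = lambda p: "too vague"
pair = co_teach_round(student, teacher, "Explain mutual recursion.")
```

The draft/revised pair doubles as preference data: the revised answer is "chosen" and the original draft "rejected," which is exactly the shape the later Stage-DPO step consumes.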

Learning:

  1. Supervised learning.
  2. Stage-DPO: to make offline reinforcement learning more effective, the preference data is split into different slices and the model is improved stage by stage.
  3. RLEIF: combining an instruction-quality reward model (IRM) with a process-supervision reward model (PRM) to achieve more precise correctness in online reinforcement learning.
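Stage-DPO builds on the standard DPO objective; below is a minimal single-pair sketch of that loss. The stage-wise data slicing itself is not shown, and the log-probabilities and β value are arbitrary example numbers, not anything from the blog.

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected,
             beta=0.1):
    """Standard DPO loss for one preference pair. Inputs are summed
    log-probabilities of the chosen/rejected responses under the
    current policy and the frozen reference model."""
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(beta * margin))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# At zero margin the loss is log(2) ~ 0.6931; it falls as the policy
# prefers the chosen answer more strongly than the reference does.
baseline = dpo_loss(0.0, 0.0, 0.0, 0.0)
improved = dpo_loss(-5.0, -9.0, -6.0, -8.0)  # margin = 2, so loss < log(2)
```

Splitting the preference pairs into slices and running this objective stage by stage, as the team describes, lets each stage start from the previous stage's improved policy.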

In the end, all speculation is in vain for now; let us look forward to the return of WizardLM-2.

Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn for deletion.