
Microsoft launches 'Learn from Mistakes' model training method, claiming to 'imitate the human learning process and improve AI reasoning capabilities'

王林
2023-11-07 17:13:04

Microsoft Research Asia, together with Peking University, Xi'an Jiaotong University and other universities, recently proposed an artificial intelligence training method called "Learning from Mistakes" (LeMA). The method is claimed to improve the reasoning ability of AI models by imitating the human learning process.


At present, large language models such as OpenAI GPT-4 and Google PaLM-2 perform well on natural language processing (NLP) tasks and on chain-of-thought (CoT) reasoning tasks such as mathematical problems.

However, open-source large models such as LLaMA-2 and Baichuan-2 still need strengthening on such problems. To improve the chain-of-thought reasoning capabilities of these open-source large language models, the research team proposed the LeMA method, which imitates the human learning process and improves a model's reasoning ability by "learning from mistakes".

▲ Image source: the related paper

This site found that the researchers' method is to fine-tune the relevant models on paired "wrong answer" and "corrected answer" data. To obtain this data, the researchers collected the wrong answers and reasoning processes of 5 different large language models (including the LLaMA and GPT series), and then used GPT-4 as a "corrector" to provide corrected answers.

It is reported that each corrected answer contains three types of information: the erroneous fragment in the original reasoning process, the reason for the error, and how to modify the original approach to obtain the correct answer.
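To make the structure of such a record concrete, here is a minimal Python sketch of one mistake-correction pair and how it might be turned into a fine-tuning example. The class name, field names, prompt wording, and the sample question are all hypothetical and chosen for illustration; the article does not describe the actual data format used by the researchers.

```python
from dataclasses import dataclass


@dataclass
class CorrectionRecord:
    """One mistake-correction pair. All field names are hypothetical."""
    question: str
    wrong_reasoning: str      # incorrect reasoning produced by some model
    error_fragment: str       # the erroneous step in the original reasoning
    error_reason: str         # why that step is wrong
    corrected_reasoning: str  # how to fix the reasoning to reach the right answer


def to_finetune_example(rec: CorrectionRecord) -> dict:
    """Build a prompt/completion pair for supervised fine-tuning (assumed layout)."""
    prompt = (
        f"Question: {rec.question}\n"
        f"Original solution: {rec.wrong_reasoning}\n"
        "Identify the mistake, explain it, and give a corrected solution.\n"
    )
    completion = (
        f"Incorrect step: {rec.error_fragment}\n"
        f"Explanation: {rec.error_reason}\n"
        f"Corrected solution: {rec.corrected_reasoning}"
    )
    return {"prompt": prompt, "completion": completion}


# Made-up sample data, not taken from the paper.
record = CorrectionRecord(
    question="Tom has 3 boxes with 4 apples each. How many apples does he have?",
    wrong_reasoning="3 + 4 = 7, so Tom has 7 apples.",
    error_fragment="3 + 4 = 7",
    error_reason="Each box holds 4 apples, so the counts must be multiplied, not added.",
    corrected_reasoning="3 * 4 = 12, so Tom has 12 apples.",
)
example = to_finetune_example(record)
print(example["prompt"])
print(example["completion"])
```

The completion carries the three pieces of information listed above, so a model trained on such pairs sees both the faulty step and its fix rather than only a clean solution.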

The researchers used the GSM8K and MATH benchmarks to test the effect of the LeMA training method on 5 open-source large models. The results show that, for the LLaMA-2-70B model, the GSM8K accuracy is 83.5% with LeMA versus 81.4% without it, while the MATH accuracy is 25.0% versus 23.6%.
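As a rough illustration of how an accuracy figure of this kind can be computed, the sketch below scores model outputs by comparing the final number in each output with the final number in the reference answer. The article does not describe the scoring procedure, so this exact-match rule and the helper names `final_number` and `accuracy` are assumptions; the `#### <answer>` line mimics the way GSM8K reference solutions end, and the sample predictions are made up.

```python
import re


def final_number(text: str):
    """Return the last number in the text as a float, or None if there is none."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None


def accuracy(predictions, references) -> float:
    """Percentage of predictions whose final number equals the reference's."""
    correct = 0
    for pred, ref in zip(predictions, references):
        p = final_number(pred)
        if p is not None and p == final_number(ref):
            correct += 1
    return 100.0 * correct / len(references)


# Made-up outputs and references for illustration only.
preds = ["3 * 4 = 12, so the answer is 12.", "The answer is 7.", "Total: 1,250"]
refs = ["#### 12", "#### 9", "#### 1250"]
score = accuracy(preds, refs)
print(f"{score:.1f}%")  # prints 66.7%
```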

The researchers have made the LeMA-related materials public on GitHub for interested readers.


Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn to have it deleted.