
Microsoft debuts the 2.7 billion parameter Phi-2 model, which outperforms many large language models

WBOY
2023-12-14 23:17:47

Microsoft has released an artificial intelligence model called Phi-2 that demonstrates extraordinary capability, with performance comparable to, or even exceeding, that of larger, more mature models up to 25 times its size.

Microsoft recently announced in a blog post that Phi-2 is a language model with 2.7 billion parameters. Compared with other base models, Phi-2 shows advanced performance, especially on complex benchmarks that assess reasoning, language comprehension, mathematics, coding, and general knowledge. Phi-2 is now available through the model catalog of Microsoft Azure AI Studio, which means researchers and developers can integrate it into third-party applications.

Phi-2 was first announced by Microsoft CEO Satya Nadella at the Ignite conference in November. The model's power comes from what Microsoft calls "textbook-quality" data, purpose-built to teach knowledge, and from techniques that draw on insights from other models.

What's unique about Phi-2? In the past, the capabilities of large language models were closely tied to their parameter count: generally speaking, more parameters meant a more powerful model. The emergence of Phi-2 challenges this conventional wisdom. Microsoft says Phi-2 matches or even surpasses larger base models on some benchmarks, including Mistral AI's 7-billion-parameter Mistral and Meta Platforms' 13-billion-parameter Llama 2; on some benchmarks it even surpasses the 70-billion-parameter Llama 2.

Perhaps most surprising, it even outperforms Google's Gemini Nano, the most efficient model in the Gemini series released last week. Gemini Nano is designed for on-device tasks and can run on smartphones, enabling features such as text summarization, advanced proofreading, grammar correction, and contextual smart replies.

Microsoft researchers say Phi-2 was evaluated extensively, on tests covering language comprehension, reasoning, mathematics, coding challenges, and more.

The company says Phi-2 achieves such strong results because it is trained on carefully selected, textbook-quality data designed to teach reasoning, knowledge, and common sense, meaning it can learn more from less data. Microsoft researchers also used techniques that transfer knowledge from smaller models.

The researchers point out that, notably, Phi-2 achieves strong performance without using techniques such as reinforcement learning from human feedback or instruction fine-tuning, which are commonly used to improve the behavior of AI models. Despite not using these techniques, Phi-2 still performs well in reducing bias and harmful content compared with other open-source models that do use them. The company attributes this to its customized data-curation work.

Phi-2 is the latest version in the series. Phi-1, the first model, released earlier this year, has 1.3 billion parameters and was fine-tuned for basic Python coding tasks. In September, Microsoft launched Phi-1.5, a model that also has 1.3 billion parameters but was trained on new data sources, including a variety of synthetic texts generated with natural language programming.

Microsoft says Phi-2's high efficiency makes it an ideal platform for researchers to explore areas such as AI safety, interpretability, and the ethical development of language models.


Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn for deletion.