Meta launches AI language model LLaMA, a large-scale language model with 65 billion parameters

PHPz
2023-04-14 18:58:01

News on February 25: Meta announced on Friday local time that it will release a new AI-based large language model for the research community, joining Microsoft, Google, and other companies spurred by ChatGPT in the artificial intelligence race.

Meta's LLaMA stands for "Large Language Model Meta AI". It is available under a non-commercial license to researchers and entities in government, community, and academia.

The company will make the underlying code available to users, so they can tweak the model themselves and use it for research-related use cases. Meta said the model’s computing power requirements are “much lower.”

According to reports, the company is offering LLaMA in several sizes (7B, 13B, 33B, and 65B parameters). Among them, LLaMA 65B and LLaMA 33B were trained on 1.4 trillion tokens, while the smallest model, LLaMA 7B, was trained on 1 trillion tokens.

Like other large language models, LLaMA works by taking a sequence of words as "input" and predicting the next word, repeating this step to recursively generate text. To train this set of models, Meta selected text from the 20 most spoken languages, focusing on those written in the Latin and Cyrillic alphabets.
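The generation loop described above can be illustrated with a minimal Python sketch. This is not Meta's implementation: a toy bigram counter stands in for the neural network, and all names here (`train_bigram`, `predict_next`, `generate`) and the sample corpus are hypothetical. The only point is to show how each predicted word is appended to the input before the next prediction.

```python
from collections import Counter, defaultdict


def train_bigram(corpus: str) -> dict:
    """Count, for each word, which words follow it in the corpus."""
    words = corpus.split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(followers: dict, context: list):
    """Return the most frequent follower of the last word, or None."""
    if not context:
        return None
    candidates = followers.get(context[-1])
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def generate(followers: dict, prompt: str, max_new_words: int = 5) -> str:
    """Repeatedly predict the next word and feed it back in as input."""
    words = prompt.split()
    for _ in range(max_new_words):
        nxt = predict_next(followers, words)
        if nxt is None:  # the toy model has never seen a follower
            break
        words.append(nxt)  # the prediction becomes part of the next input
    return " ".join(words)


# Hypothetical toy corpus; a real model is trained on trillions of tokens.
corpus = "the cat sat on the mat and the cat sat on the rug"
model = train_bigram(corpus)
print(generate(model, "the cat", max_new_words=4))
```

A real language model replaces the frequency table with a neural network that scores every possible next token given the whole preceding sequence, but the outer loop has the same shape.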

Of course, like other models, LLaMA also faces the challenges of bias, toxic comments, and hallucinations, and Meta needs to do more research to address the shortcomings in this type of language model.

Meta said that LLaMA, as a base model, is designed to be versatile and applicable to many different use cases, in contrast to a fine-tuned model built for a specific task. By sharing LLaMA's code, Meta aims to make it easier for other researchers to find new ways to limit or eliminate problems such as bias, toxicity, and hallucination. In its paper, Meta also provides a set of benchmark evaluations of model bias and toxicity to show the model's limitations and support further research in this critical area.

It is worth mentioning that Meta also released the large language model OPT-175B in May last year. That project was also aimed at researchers and formed the basis for a new iteration of Meta's chatbot, BlenderBot.

Later, the company also launched a model called Galactica, which it said could write scientific articles and solve mathematical problems, but the demo was later taken down because it repeatedly generated “authoritative-sounding” content.

Statement:
This article is reproduced from 51cto.com. In case of any infringement, please contact admin@php.cn to have it removed.