
LLaMa 3 may be postponed to July, targeting GPT-4 and learning lessons from Gemini

王林 | 2024-03-01 11:19

Past image generation models have often been criticized for rendering predominantly white people, and Google's Gemini model ran into trouble by overcorrecting in the other direction: its generated images became overly cautious and deviated significantly from historical fact, surprising users. Google says the model is more conservative than its developers intended. This caution shows up not only in generated images but also in the model's tendency to treat some prompts as sensitive and refuse to answer them.

As this issue continues to attract attention, striking a balance between safety and usability has become a major challenge for Meta. LLaMA 2 is regarded as a "strong player" in the open source field and has become Meta's star model, reshaping the large-model landscape as soon as it launched. Meta is now preparing to launch LLaMa 3, but it first needs to solve a problem LLaMA 2 left behind: the model was too conservative when answering controversial questions.

LLaMa 3 may be postponed to July, targeting GPT-4 and learning lessons from Gemini

Finding a balance between security and usability

Meta added safeguards to LLaMA 2 that prevent the LLM from answering a range of controversial questions. While such conservatism is necessary for extreme cases, such as queries about violence or illegal activity, it also limits the model's ability to answer more common but only mildly controversial questions. According to The Information, when a reporter asked LLaMA 2 how employees could avoid going into the office on days they were required to come in, the model refused to give advice, responding that "it is important to respect and abide by the company's policies and guidelines." LLaMA 2 also refuses to answer questions about how to prank a friend, win a war, or wreck a car engine. These conservative answers are intended to avoid a PR disaster.

However, it was revealed that Meta's senior leadership and some researchers involved in the model's development believed LLaMA 2's answers were too "safe." Meta is working to make the upcoming LLaMA 3 more flexible, providing more contextual information in its answers rather than rejecting questions outright. Researchers are trying to make LLaMA 3 more responsive to users and better at understanding what they might actually mean. The new version is reportedly better able to distinguish multiple meanings of a word: for example, LLaMA 3 might understand that a question about how to "kill" a car's engine refers to shutting it off, not destroying it. The Information also reports that Meta plans to appoint someone internally in the coming weeks to oversee tone and safety training, as part of the company's efforts to make model responses more nuanced.

The challenge of finding this balance is not Meta's and Google's alone; many technology giants face it to varying degrees. They must build products that people love, can actually use, and that work smoothly, while also ensuring those products are safe and reliable. This is a problem technology companies must confront head-on as they race to advance AI.

More information on LLaMa 3

The release of LLaMa 3 is highly anticipated. Meta plans to release it in July, but the timing is still subject to change. Meta CEO Mark Zuckerberg is ambitious, having once said: "Although Llama 2 is not the industry-leading model, it is the best open source model. For LLaMa 3 and subsequent models, our goal is to build SOTA and eventually become the industry-leading model."


Original address: https://www.reuters.com/technology/meta-plans-launch-new-ai-language-model-llama-3-july-information-reports-2024-02-28/

Meta hopes LLaMa 3 can catch up with OpenAI's GPT-4. Meta staff revealed that it has not yet been decided whether LLaMa 3 will be multimodal, able to understand and generate both text and images, because researchers have not yet begun fine-tuning the model. However, LLaMa 3 is expected to have more than 140 billion parameters, significantly exceeding LLaMa 2 and indicating a major improvement in its ability to handle complex queries.

Beyond commanding some 350,000 H100s and tens of billions of dollars, talent is also a "necessity" for training LLaMa 3. Meta develops LLaMa through a generative AI group separate from its fundamental AI research team. Louis Martin, the researcher responsible for the safety of LLaMa 2 and 3, left the company in February; Kevin Stone, who led reinforcement learning, also departed this month. It is unclear whether these departures will affect LLaMa 3's training. We will have to wait and see whether LLaMa 3 can strike the right balance between safety and usability, and whether it brings new surprises in coding capability.


Statement: This article is reproduced from 51cto.com.