A lightweight ChatGPT training method goes open source! Built around LLaMA in just 3 days, with training claimed to be 15 times faster than OpenAI's

PHPz

Apr 13, 2023 pm 05:13 PM

Has a lightweight version of ChatGPT built on Meta's model arrived?

Just three days after Meta announced LLaMA, an open-source training method for turning it into a ChatGPT-style model appeared, claiming training up to 15 times faster than ChatGPT's.

LLaMA is a very fast, very small GPT-3-class model released by Meta. It has only about 10% as many parameters as GPT-3 and needs just a single GPU to run.

The method for turning it into ChatGPT is called ChatLLaMA. It trains the model with RLHF (reinforcement learning from human feedback) and quickly became popular online.

So, is Meta's open-source version of ChatGPT really coming?

Wait a minute, things are not that simple.

The "open source method" for training LLaMA into ChatGPT

Open the ChatLLaMA project homepage and you will find that it actually integrates four parts: DeepSpeed, the RLHF method, LLaMA, and datasets generated with LangChain agents.

Among them, DeepSpeed is an open-source deep learning training optimization library. It includes an existing optimization technique called ZeRO, which makes it practical to train larger models. Specifically, it helps increase training speed, reduce cost, and make large models more usable.
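To make the memory side of ZeRO concrete, here is a rough sketch of the per-GPU memory needed for model states at each ZeRO stage. It follows the accounting commonly given for mixed-precision Adam (2 bytes per parameter for fp16 weights, 2 for fp16 gradients, 12 for optimizer states). The function name, the 7-billion-parameter model and the 8 GPUs are illustrative assumptions, not figures from the ChatLLaMA project, and activations and buffers are ignored.

```python
def zero_model_state_bytes(num_params, num_gpus, stage, optimizer_bytes=12):
    """Approximate per-GPU bytes for model states under mixed-precision Adam.

    Hypothetical helper for illustration only; it ignores activations,
    temporary buffers and memory fragmentation.
    """
    params = 2 * num_params                # fp16 weights
    grads = 2 * num_params                 # fp16 gradients
    optim = optimizer_bytes * num_params   # fp32 weights, momentum, variance
    if stage >= 1:                         # stage 1: partition optimizer states
        optim /= num_gpus
    if stage >= 2:                         # stage 2: also partition gradients
        grads /= num_gpus
    if stage >= 3:                         # stage 3: also partition parameters
        params /= num_gpus
    return params + grads + optim


if __name__ == "__main__":
    # Assumed example values: a 7-billion-parameter model on 8 GPUs.
    for stage in range(4):
        gigabytes = zero_model_state_bytes(7e9, 8, stage) / 1e9
        print(f"ZeRO stage {stage}: about {gigabytes:.1f} GB per GPU")
```

Under these assumptions the model states drop from about 112 GB per GPU without partitioning to about 14 GB at stage 3, which is why this kind of partitioning matters for fitting large models onto ordinary hardware.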

RLHF uses a reward model to fine-tune the pre-trained model. To build the reward model, multiple models first generate answers to questions, and humans then rank those answers so that the reward model learns to score them. After that, the reward model scores the answers the model generates, and reinforcement learning uses those scores to improve the model's capabilities.
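The ranking step can be sketched with a pairwise loss, a common way to train reward models from human rankings. This is a toy illustration, not ChatLLaMA's actual code: the linear reward function, the two-number feature vectors, and the names `score`, `pairwise_ranking_loss` and `train_reward_weights` are all assumptions made for the example.

```python
import math


def score(weights, features):
    """Toy reward model: a linear score over hand-made answer features."""
    return sum(w * f for w, f in zip(weights, features))


def pairwise_ranking_loss(score_preferred, score_rejected):
    """-log(sigmoid(s_p - s_r)): small when the preferred answer scores higher."""
    return math.log1p(math.exp(-(score_preferred - score_rejected)))


def train_reward_weights(ranked_pairs, dim, lr=0.1, epochs=200):
    """Fit the weights by gradient descent so preferred answers outscore rejected ones."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for preferred, rejected in ranked_pairs:
            diff = score(weights, preferred) - score(weights, rejected)
            # Derivative of the loss with respect to diff is -sigmoid(-diff).
            coeff = -1.0 / (1.0 + math.exp(diff))
            for i in range(dim):
                weights[i] -= lr * coeff * (preferred[i] - rejected[i])
    return weights


if __name__ == "__main__":
    # Each pair is (features of the answer humans ranked higher, features of the other).
    # The feature values are made up for the example.
    pairs = [([1.0, 0.2], [0.1, 0.9]), ([0.8, 0.5], [0.3, 0.4])]
    learned = train_reward_weights(pairs, dim=2)
    for preferred, rejected in pairs:
        print(score(learned, preferred) > score(learned, rejected))
```

In a real RLHF pipeline the reward model is a neural network over text rather than a linear function, and its scores then serve as the reward signal for a reinforcement learning algorithm that updates the language model.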

LangChain is a library for developing large language model applications. It aims to integrate various large language models and combine them with other knowledge sources or computing capabilities to build practical applications. A LangChain agent lays out GPT-3's entire reasoning process as a chain of thought and records the operations it performs.

At this point you will notice that the most critical piece is still the LLaMA model weights. Where do they come from?

Well, you have to apply to Meta for them yourself; ChatLLaMA does not provide them. (Although Meta says LLaMA is open source, you still need to apply for access.)

So essentially, ChatLLaMA is not an open-source ChatGPT project, just a training method built on LLaMA. The projects integrated in its library were already open source.

In fact, ChatLLaMA was not built by Meta but by an AI start-up called Nebuly AI.

Nebuly AI has made an open source library called Nebullvm, which integrates a series of plug-and-play optimization modules to improve AI system performance.

For example, the modules currently included in Nebullvm include OpenAlphaTensor, which is based on DeepMind's open-source AlphaTensor algorithm, and optimization modules that automatically detect the hardware and accelerate for it...

ChatLLaMA is also one of these modules, but note that its open-source license does not permit commercial use.

So if you were hoping to use it directly as a "domestic self-developed ChatGPT", it may not be that simple (doge).

After looking at the project, some netizens said it would be great if someone could actually get hold of LLaMA's model weights (code)...

But some netizens also pointed out that the claim of being "15 times faster than the ChatGPT training method" is purely misleading:

The so-called 15 times faster is just because the LLaMA model itself is very small and can even run on a single GPU. It shouldn't be because of anything this project does, right?

This netizen also recommended an RLHF training method called trlx, which they consider better than the one in the library and which trains 3 to 4 times faster than the usual RLHF method.

Have you got the code for LLaMA? What do you think of this training method?

ChatLLaMA address: https://www.php.cn/link/fed537780f3f29cc5d5f313bbda423c4

Reference link: https://www.php.cn/link/fe27f92b1e3f4997567807f38d567a35

Statement

This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.
