Someone finally made it clear about the current situation of GPT! OpenAI's latest speech went viral, and it must be a genius hand-picked by Musk
Following the release of Windows Copilot, one talk set the Microsoft Build conference abuzz.
In his talk, former Tesla AI director Andrej Karpathy argued that Tree of Thoughts is similar to AlphaGo's Monte Carlo Tree Search (MCTS). Brilliant!
Netizens exclaimed: this is the most detailed and interesting guide to using large language models and GPT-4!
In addition, Karpathy noted that LLaMA 65B is “significantly more powerful than GPT-3 175B” thanks to scaled-up training and data, and introduced Chatbot Arena, an anonymous arena for comparing large models:
Claude's score is between ChatGPT 3.5 and ChatGPT 4.
Netizens said that Karpathy's talks are always great, and this one, as always, did not disappoint.
Also going viral alongside the talk was a set of notes compiled by a Twitter user from the talk: 31 notes in all, with more than 3,000 likes:
So, what specific content was mentioned in this much-anticipated speech?
Karpathy’s speech this time is mainly divided into two parts.
In Part One, he explained how to train a "GPT assistant".
Karpathy focused on the four training stages of an AI assistant:
pre-training, supervised fine-tuning, reward modeling and reinforcement learning.
Each stage requires a data set.
The pre-training stage requires a large amount of computing resources and a very large collection of data. A base model is trained on a large unsupervised data set.
Karpathy used more examples to supplement:
Then we enter the fine-tuning stage.
Fine-tuning the base model on a smaller supervised data set through supervised learning produces an assistant model that can answer questions.
He also showed the evolution process of some models. I believe many people have seen the above "evolutionary tree" picture before.
Karpathy believes that the best open source model currently is Meta’s LLaMA series (because OpenAI has not open sourced anything about GPT-4).
What needs to be clearly pointed out here is that the base model is not an assistant model.
Although the base model has the ability to solve problems, the answers it gives are not trustworthy, while the assistant model can provide reliable answers. The supervised fine-tuned assistant model is trained on top of the base model, and it generates replies and understands text structure better than the base model does.
Reinforcement learning is another key process when training language models.
High-quality, manually annotated data is used during training, and a loss function is built through reward modeling to improve the model's performance. Reinforcement learning then works by increasing the probability of positively labeled outputs and decreasing the probability of negatively labeled ones.
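The article does not give the loss itself. As a rough sketch, one common way to turn human comparisons into a reward-modeling loss is a pairwise logistic loss: the reward model is penalized when it scores the human-preferred completion lower than the rejected one. The function name and the choice of this particular loss are assumptions for illustration, not something stated in the talk.

```python
import math


def pairwise_reward_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Illustrative loss: -log(sigmoid(reward_preferred - reward_rejected)).

    Assumption: comparisons come as (preferred, rejected) pairs scored by a
    reward model. The loss is small when the preferred completion gets the
    higher reward and large when the ordering is wrong.
    """
    margin = reward_preferred - reward_rejected
    # Two algebraically equal forms, chosen so exp() never overflows.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The reward model agrees with the human ranking: low loss.
loss_when_right = pairwise_reward_loss(2.0, 0.0)
# The reward model disagrees with the human ranking: high loss.
loss_when_wrong = pairwise_reward_loss(0.0, 2.0)
```

Minimizing this loss pushes the reward of preferred completions up and that of rejected ones down; the trained reward model can then score new completions during the reinforcement learning stage.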
Human judgment is critical to improving AI models when it comes to creative tasks, and models can be trained more effectively by incorporating human feedback.
After reinforcement learning with human feedback, an RLHF model is obtained.
After the models are trained, the next step is how to effectively use these models to solve problems.
In Part 2, Karpathy discusses prompting strategies, fine-tuning, the rapidly evolving tool ecosystem, and future expansion.
Karpathy gave another specific example to illustrate:
When we write, a lot of mental activity goes on, including checking whether what we have written is accurate. For GPT, it is merely a sequence of tokens.
Prompts can make up for this cognitive gap.
Karpathy further explained how chain-of-thought prompting works.
For reasoning problems, if you want a Transformer to perform better, you need to let it work through the information step by step instead of throwing a very complex problem at it all at once.
If you give it a few examples, it will imitate the template of those examples, and the final result will be better.
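A few-shot, step-by-step prompt of this kind can be assembled with plain string formatting. The sketch below is only an illustration: the helper name, the dictionary keys, the worked examples and the "Let's think step by step" cue are all assumptions, and the resulting string would still have to be sent to whatever model you use.

```python
def build_few_shot_cot_prompt(examples, question):
    """Build a prompt that shows worked examples, then asks a new question.

    Each example is a dict with hypothetical keys 'question', 'reasoning'
    and 'answer'. The model is expected to imitate the same template.
    """
    parts = []
    for ex in examples:
        parts.append(
            f"Q: {ex['question']}\n"
            f"A: Let's think step by step. {ex['reasoning']} "
            f"The answer is {ex['answer']}."
        )
    # End with the same cue so the model continues with its own reasoning.
    parts.append(f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(parts)


examples = [
    {
        "question": "Tom has 3 apples and buys 2 more. How many apples does he have?",
        "reasoning": "He starts with 3 and adds 2, so 3 + 2 = 5.",
        "answer": "5",
    },
    {
        "question": "A train travels 60 km per hour for 2 hours. How far does it go?",
        "reasoning": "Distance is speed times time, so 60 * 2 = 120.",
        "answer": "120 km",
    },
]

prompt = build_few_shot_cot_prompt(
    examples, "A box holds 4 rows of 6 eggs. How many eggs are in the box?"
)
```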
The model can only answer by continuing its token sequence. If the content it generates is wrong, you can prompt it to regenerate.
If you don't ask it to check, it won't check it by itself.
This touches on the distinction between System 1 and System 2.
Nobel laureate in economics Daniel Kahneman proposed in "Thinking, Fast and Slow" that the human cognitive system consists of two subsystems, System 1 and System 2. System 1 relies mainly on intuition, while System 2 is a logical, analytical system.
In layman's terms, System 1 is a fast, automatic process, while System 2 is the slow, deliberate part.
This was also discussed in a recent popular paper, "Tree of Thoughts".
Being deliberate here means going beyond a single prompt that simply answers the question: it looks more like prompts driven by Python glue code, with many prompts chained together. To scale this up, multiple prompts have to be maintained and a tree search algorithm run over them.
Karpathy believes that this idea is very similar to AlphaGo:
When AlphaGo plays Go, it needs to consider where to place the next piece. Initially it learned by imitating humans.
In addition, it runs a Monte Carlo tree search to explore multiple potential strategies: it evaluates many possible moves and keeps only the better ones. I think Tree of Thoughts is somewhat the equivalent of AlphaGo.
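The "evaluate many candidates, keep only the better ones" idea can be sketched as a small tree search. The code below is a simplified beam search, not Monte Carlo Tree Search and not the algorithm from the paper. In a real system `propose` and `score` would be calls to a language model; here they are hypothetical stand-ins on a toy problem (pick three numbers from 1, 3 and 4 that sum to 10) so the sketch runs on its own.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]


def tree_search(
    root: State,
    propose: Callable[[State], List[State]],
    score: Callable[[State], float],
    beam_width: int,
    depth: int,
) -> State:
    """Expand every kept state, score the children, keep the best few."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        # Highest score first; only the top beam_width candidates survive.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=score)


TARGET = 10


def propose(state: State) -> List[State]:
    # Stand-in for "ask the model for possible next thoughts".
    return [state + (step,) for step in (1, 3, 4)]


def score(state: State) -> float:
    # Stand-in for "ask the model how promising this partial solution is".
    return -abs(TARGET - sum(state))


best = tree_search((), propose, score, beam_width=2, depth=3)
greedy = tree_search((), propose, score, beam_width=1, depth=3)
```

With two candidates kept per level the search reaches the target, while keeping only one (pure greedy choice) commits too early and ends one short of it, which is the motivation for maintaining several partial solutions at once.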
In this regard, Karpathy also mentioned AutoGPT:
I think its effect is not very good at present, and I do not recommend its practical application. I think we might be able to learn from its evolution over time.
Another little trick is retrieval-augmented generation, combined with effective prompts.
The content of the context window is the Transformer's working memory at runtime. If you can add task-related information to the context, the model will perform very well, because it can access that information immediately.
In short, relevant data can be indexed so that the model can access it efficiently.
Transformers will perform better if they also have a primary document to reference.
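A minimal sketch of the retrieve-then-prompt idea follows. It ranks documents with bag-of-words cosine similarity, which is an assumption made to keep the example dependency-free; production systems typically use learned embeddings and a vector index. All function names and the sample documents are illustrative.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_prompt(query: str, documents: List[str], k: int = 2) -> str:
    """Place the retrieved text in the context ahead of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


docs = [
    "LLaMA is a family of language models released by Meta.",
    "Monte Carlo Tree Search evaluates many possible moves in Go.",
    "Kahneman describes System 1 and System 2 thinking.",
]
query = "Which company released the LLaMA models?"
top = retrieve(query, docs, k=1)
prompt = build_prompt(query, docs, k=1)
```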
Finally, Karpathy briefly talked about constrained prompting and fine-tuning in large language models.
Large language models can be improved through constrained prompting and fine-tuning. Constrained prompting enforces templates in the output of large language models, while fine-tuning adjusts the model's weights to improve performance.
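One way to picture template enforcement: the program emits the fixed parts of the output itself and only asks the model to fill the slots, so the result always follows the template. The sketch below is an assumption about how this could look, not the API of any particular library; `generate` is a hypothetical callback standing in for a model call, and here it is faked with a dictionary.

```python
import json
import re
from typing import Callable

SLOT = re.compile(r"\{\{(\w+)\}\}")


def fill_template(template: str, generate: Callable[[str, str], str]) -> str:
    """Copy fixed text verbatim and call generate(slot_name, prefix) per slot."""
    output = []
    position = 0
    for match in SLOT.finditer(template):
        output.append(template[position:match.start()])
        prefix = "".join(output)  # everything produced so far
        output.append(generate(match.group(1), prefix))
        position = match.end()
    output.append(template[position:])
    return "".join(output)


# Hypothetical stand-in for a model: looks the slot value up in a dict.
fake_answers = {"name": "Ada", "age": "36"}


def fake_generate(slot_name: str, prefix: str) -> str:
    return fake_answers[slot_name]


template = '{"name": "{{name}}", "age": {{age}}}'
result = fill_template(template, fake_generate)
parsed = json.loads(result)
```

Because the braces, keys and quotes come from the template rather than from the model, the output parses as JSON as long as each slot value is valid for its position.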
I recommend using large language models in low-risk applications, always combined with human oversight, treating them as a source of inspiration and advice, and thinking of them as copilots rather than fully autonomous agents.
After earning his PhD, Andrej Karpathy's first job was researching computer vision at OpenAI.
Later, Musk, one of the co-founders of OpenAI, took a liking to Karpathy and recruited him to Tesla. Musk and OpenAI fell out over the matter, and Musk was eventually pushed out. At Tesla, Karpathy was responsible for Autopilot, FSD and other projects.
In February of this year, seven months after leaving Tesla, Karpathy rejoined OpenAI.
Recently he tweeted that he is currently very interested in the development of the open source large language model ecosystem, which shows signs reminiscent of an early Cambrian explosion.
Links:
[1]https://www.youtube.com/watch?v=xO73EUwSegU (speech video)
[2]https://arxiv.org/pdf/2305.10601.pdf ("Tree of Thoughts" paper)
Reference link:
[1]https://twitter.com/altryne/status/1661236778458832896
[2]https://www.reddit.com/r/MachineLearning/comments/13qrtek/n_state_of_gpt_by_andrej_karpathy_in_msbuild_2023/
[3]https://www.wisdominanutshell.academy/state-of-gpt/