AI writes its own code to let agents evolve! OpenAI's large model has a taste of "human thought"

王林
2023-04-09 18:21:04

Things just got interesting!

The AI "watched" how humans commit changes on GitHub, and then imitated human programmers in modifying code...

In the end, the AI successfully "trained" an intelligent robot.

No kidding: this rather unnerving thing really did happen in a recent study released by OpenAI.

The researchers originally set out to solve a genetic programming (GP) problem: getting a virtual robot to learn to move.

(GP is a subfield of evolutionary computation that aims to build programs automatically so that they solve problems without human intervention.)

But OpenAI took a different approach and put its own large language model (LLM) into the loop, and the result was something nobody expected.

In the past, evolving an agent required human researchers to step in, make detailed adjustments, and decide the direction of evolution so that the agent developed in a useful direction.

Now the large model takes care of all of that. It can learn, write its own code, and "tune" itself.

As soon as Joel Lehman, the paper's first author, shared the work online, it drew a great deal of attention from netizens.

After reading it, one programmer commented, "I can't keep up (with the pace of technological development)."

Even OpenAI itself says in the study:

It bridges the gap between evolutionary algorithms and approaches that operate at the level of human thought.

So how did the AI pull off this "magic"?

Study GitHub, and the AI writes the code itself

Designing robots that can move in a virtual environment is a popular task in genetic algorithm research.

The Sodarace race is especially popular because it is computationally cheap and easy to visualize.

The rules are simple. Robots composed of "joints" and "muscles" race on various terrains.

OpenAI also deliberately rewrote the whole setup so that robots are described by Python programs rather than by a dedicated genetic encoding, in order to show that the new method carries over to modern programming languages.

For example, a short piece of Python code can serve as the initial seed robot.

The seed defines the four corner joints and the center joint of a square and connects them with "muscles".
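
The original code screenshot is not reproduced here. As a rough stand-in, the sketch below builds a square seed in Python, assuming four corner joints plus one center joint. The `WalkerCreator` class and its `add_joint` / `add_muscle` methods are hypothetical names invented for this illustration and should not be read as the paper's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class WalkerCreator:
    """Hypothetical helper that collects joints and muscles (not the paper's API)."""
    joints: list = field(default_factory=list)
    muscles: list = field(default_factory=list)

    def add_joint(self, x, y):
        # Store the joint position and return its index as a handle.
        self.joints.append((x, y))
        return len(self.joints) - 1

    def add_muscle(self, a, b):
        # A muscle connects two joints, identified by their indices.
        self.muscles.append((a, b))


def make_square_seed(size=10.0):
    wc = WalkerCreator()
    # Four corner joints of the square, in order around the perimeter.
    corners = [
        wc.add_joint(0.0, 0.0),
        wc.add_joint(size, 0.0),
        wc.add_joint(size, size),
        wc.add_joint(0.0, size),
    ]
    center = wc.add_joint(size / 2, size / 2)
    for i, corner in enumerate(corners):
        # Connect each corner to the next corner and to the center.
        wc.add_muscle(corner, corners[(i + 1) % len(corners)])
        wc.add_muscle(corner, center)
    return wc


seed = make_square_seed()
print(len(seed.joints), "joints,", len(seed.muscles), "muscles")
```

Because the robot is just the return value of a short program, changing the robot means editing source code, which is exactly the kind of change a code-trained model can propose.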

However, such a square structure cannot move at all. Next, the code needs to be modified by a genetic algorithm.

The research team sees two gaps between traditional genetic algorithms modifying code and human programmers doing it themselves:

First, software keeps getting more complex, and humans cope by creating modules, whereas even the most advanced genetic algorithms cannot yet do this in the programming languages humans use.

Second, almost all genetic algorithms rely on random mutations, whereas every change a human programmer makes has a purpose: adding a feature, improving efficiency, or fixing a bug.

So is there a way for AI to learn how humans modify the code?

Yes, the required training data is all stored on GitHub.

Good programmers write a commit message every time they commit code, explaining clearly what the commit changes.

The commit message, combined with the diff that compares the code before and after the commit, makes excellent training material for the AI.
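
To make the idea concrete, here is a minimal sketch of how a commit message and a diff might be packed into a single text record. It uses Python's standard `difflib.unified_diff`; the `<MSG>` / `<DIFF>` markers and the `make_training_example` function are assumptions made for this illustration, not necessarily the format used in the paper.

```python
import difflib


def make_training_example(filename, before, after, message):
    """Combine a commit message and a unified diff into one text record."""
    diff_lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    )
    diff_text = "".join(diff_lines)
    # Hypothetical layout: the message first, then the diff to be learned.
    return f"<MSG> {message}\n<DIFF>\n{diff_text}"


before = "def area(w, h):\n    return w + h\n"
after = "def area(w, h):\n    return w * h\n"
example = make_training_example(
    "geometry.py", before, after, "Fix area: multiply, not add"
)
print(example)
```

A model trained on many such records sees the stated intent next to the exact lines that changed, which is what lets it propose purposeful edits later.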

The researchers selected commits with clear messages and small code changes, and used them to train a model with the GPT-3 architecture.
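
A filter of this kind could look roughly like the sketch below. The `Commit` record, the helper names, and both thresholds (`min_words`, `max_changed_lines`) are illustrative guesses; the article does not state the actual selection criteria.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    message: str
    diff: str


def changed_line_count(diff):
    """Count added/removed lines, ignoring the '---'/'+++' file headers."""
    return sum(
        1
        for line in diff.splitlines()
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )


def keep_commit(commit, min_words=3, max_changed_lines=20):
    # Both thresholds are illustrative guesses, not values from the paper.
    clear_message = len(commit.message.split()) >= min_words
    small_change = 0 < changed_line_count(commit.diff) <= max_changed_lines
    return clear_message and small_change


commits = [
    Commit(
        "Fix off-by-one in loop bound",
        "--- a/x.py\n+++ b/x.py\n-for i in range(n+1):\n+for i in range(n):\n",
    ),
    Commit("wip", "--- a/y.py\n+++ b/y.py\n+print('debug')\n"),
    Commit("Rewrite the whole module", "--- a/z.py\n+++ b/z.py\n" + "+line\n" * 500),
]
kept = [c.message for c in commits if keep_commit(c)]
print(kept)
```

Here the one-word message and the 500-line rewrite are both dropped, leaving only the small, well-described fix.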

In effect, the AI learns from human programmers how to modify a piece of code with a purpose.

The model used in this paper does not need to be anywhere near as large as the 175 billion parameters of full GPT-3; at most 750 million parameters is enough.

This yields the base model, which plays the role of the mutation operator in the genetic algorithm.
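
The sketch below illustrates what "model as mutation operator" means in the simplest possible evolutionary loop. Everything in it is a toy assumption: `fake_diff_model` merely stands in for the trained model (a real system would send the program and a commit-style instruction to the model and apply the diff it returns), and the "4 legs" objective is invented for the example.

```python
import random

SEED_PROGRAM = "legs = 0\nbody = 'square'\n"


def fake_diff_model(program, instruction, rng):
    """Hypothetical stand-in for the trained diff model.

    It ignores `instruction` and fakes a purposeful edit by changing `legs`.
    """
    lines = program.splitlines()
    legs = int(lines[0].split("=")[1])
    lines[0] = f"legs = {legs + rng.choice([-1, 1, 2])}"
    return "\n".join(lines) + "\n"


def fitness(program):
    # Toy objective: run the tiny program and prefer designs with 4 legs.
    scope = {}
    exec(program, {}, scope)
    return -abs(scope["legs"] - 4)


def evolve(seed, mutate, steps, rng):
    """Minimal (1+1) evolutionary loop with a pluggable mutation operator."""
    best = seed
    for _ in range(steps):
        child = mutate(best, "Improve the walker", rng)
        # Keep the child only if it is strictly better than the parent.
        if fitness(child) > fitness(best):
            best = child
    return best


best = evolve(SEED_PROGRAM, fake_diff_model, steps=50, rng=random.Random(0))
print(best)
```

The loop never needs to know how `mutate` works, so a random mutation and a language model are interchangeable from the algorithm's point of view; only the quality of the proposed changes differs.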

Getting the AI to design new robots then takes three steps.

The first step is to use the classic MAP-Elites algorithm to generate an initial set of robots.

MAP-Elites is a quality diversity (QD) algorithm: it keeps robots that behave differently from one another while each is of high quality.
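
For readers unfamiliar with MAP-Elites, here is a minimal sketch of its core loop on a toy problem. The toy "robot" (a tuple of width, height, and speed), the 4x4 niche grid, and the fitness function are all assumptions made for illustration; only the archive logic reflects the algorithm itself.

```python
import random


def map_elites(seed, mutate, evaluate, iterations, rng):
    """Minimal MAP-Elites: keep the best solution found in each behavior niche.

    `evaluate` returns (niche, fitness); `mutate` returns a changed copy.
    """
    niche, fit = evaluate(seed)
    archive = {niche: (fit, seed)}
    for _ in range(iterations):
        # Pick a random elite from the archive and mutate it.
        _, parent = rng.choice(list(archive.values()))
        child = mutate(parent, rng)
        niche, fit = evaluate(child)
        # Keep the child if its niche is empty or it beats the current elite.
        if niche not in archive or fit > archive[niche][0]:
            archive[niche] = (fit, child)
    return archive


# Toy domain (an assumption): a "robot" is a (width, height, speed) tuple.
def mutate(robot, rng):
    i = rng.randrange(3)
    changed = list(robot)
    changed[i] = min(1.0, max(0.0, changed[i] + rng.gauss(0, 0.2)))
    return tuple(changed)


def evaluate(robot):
    width, height, speed = robot
    # Behavior niche: which cell of a 4x4 (width, height) grid the robot is in.
    niche = (min(int(width * 4), 3), min(int(height * 4), 3))
    return niche, speed  # pretend faster robots are better


archive = map_elites((0.5, 0.5, 0.5), mutate, evaluate, 2000, random.Random(0))
print(len(archive), "niches filled")
```

The archive ends up holding one elite per niche, so diversity comes from the grid and quality comes from the per-niche competition.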

The second step pre-trains the model on the data generated in the first step, so that the AI first learns to design robots within the distribution of the training data.

This is what the animation that stunned the Internet shows: step by step, the AI turns an immobile "block" into a robot that moves by bouncing on alternating legs.

The third step is to fine-tune the model with reinforcement learning so that the AI can generate robots suited to different terrain conditions.

Finally, the researchers showed robots that evolved from three seeds to demonstrate the results.

Their structures and ways of moving are completely different.

Netizens exclaimed “the thinking is so clear”

The study made waves as soon as it was released.

Many netizens were impressed by this novel "large model + evolutionary algorithm" combination.

Researchers who have done related work also said they had never thought of using a large model to learn mutations in the form of diffs.

Beyond discussing the research and the model itself, some netizens also posted a picture.

Hmm... there is a certain resemblance.

Team Introduction

The team members of this research are all from OpenAI.

The first author of the paper is Joel Lehman, a machine learning scientist. His areas of focus include AI safety, reinforcement learning, and open-ended search algorithms.

Joel Lehman also previously co-wrote the book "Why Greatness Cannot Be Planned: The Myth of the Objective", which grew out of his thinking about the development of artificial intelligence.

As for the next step of this research, Joel Lehman himself said:

There is another important question, which is the extent to which the model can be applied to other environments.

The efficacy of mutations in GP can now be greatly improved by ELM [Evolution through Large Models, the method proposed in the paper], which will inspire a wide range of new applications and research directions.

So has this research also given you new inspiration?

Reference links:

[1] https://arxiv.org/abs/2206.08896

[2] https://twitter.com/joelbot3000/status/1538770905119150080?s=21&t=l8AASYjgC6RAEEimcQaFog

Statement:
This article is reproduced from 51cto.com. In case of any infringement, please contact admin@php.cn to have it removed.