What is the point of teaching AI to play Honor of Kings?

王林
2023-04-11 19:28:10

On November 28, NeurIPS 2022 officially opened.

As one of the most prestigious artificial intelligence conferences in the world, NeurIPS draws the attention of the computer science community at the end of every year. Papers accepted by NeurIPS represent the highest level of current research in artificial intelligence and computational neuroscience, and they also reflect shifts in industry trends.

What’s interesting is that this year’s “contestants” seem to have a special liking for “games” in their research.

For example, MineDojo from Li Fei-Fei's team, built on the Minecraft game environment, won an Outstanding Datasets and Benchmarks Paper award. Thanks to the open-ended nature of the game, researchers can train agents on many types of tasks in MineDojo, giving AI more general capabilities.

Another game-related paper that made it through the strict acceptance rate may feel familiar to many gamers.

After all, who hasn't played Honor of Kings?

Paper: "Honor of Kings Arena: an Environment for Generalization in Competitive Reinforcement Learning"

Address: https://openreview.net/pdf?id=7e6W6LEOBg3

In the paper, the researchers propose a test environment based on the MOBA game "Honor of Kings". Its purpose is similar to that of MineDojo: training AI.

Why are MOBA game environments so popular?

Ever since DeepMind launched AlphaGo, games, as simulated environments with a high degree of freedom and high complexity, have been an important choice for AI research and experiments.

However, unlike humans, who can learn continuously from open-ended tasks, agents trained in lower-complexity games cannot generalize their abilities beyond specific tasks. To put it simply, these AIs can only play chess or old Atari games.

To develop AI that is more "general-purpose", the focus of academic research has gradually shifted from board games to more complex games, including imperfect-information games (such as poker) and strategy games (such as MOBA and RTS games).

At the same time, as Li Fei-Fei's team noted in the award-winning paper, for an agent to generalize to more tasks, the training environment needs to provide enough tasks.

DeepMind, whose AlphaGo and its successor AlphaZero defeated every top player in the Go world, quickly realized this.

In 2016, DeepMind teamed up with Blizzard to launch the StarCraft II Learning Environment (SC2LE), based on "StarCraft II", whose space complexity is 10 to the power of 1685. SC2LE provides researchers with specifications for agent actions and rewards, and an open source Python interface for communicating with the game engine.

China also has an "AI training ground" with excellent credentials.

In the well-known MOBA game "Honor of Kings", the player's action-state space is as large as 10 to the 20,000th power, far larger than that of Go and other games, and even larger than the total number of atoms in the entire universe (10 to the 80th power).
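To get a feel for these magnitudes, the comparison can be carried out on the exponents alone. The following is a minimal sketch that uses only the figures quoted in this article; the dictionary keys and the function name are labels chosen here for illustration.

```python
# Base-10 exponents of the sizes quoted in this article.
EXPONENTS = {
    "atoms_in_universe": 80,
    "starcraft2": 1685,
    "honor_of_kings": 20000,
}


def orders_of_magnitude_apart(a: str, b: str) -> int:
    """Return how many powers of ten separate quantity `a` from quantity `b`."""
    return EXPONENTS[a] - EXPONENTS[b]


gap_atoms = orders_of_magnitude_apart("honor_of_kings", "atoms_in_universe")
gap_sc2 = orders_of_magnitude_apart("honor_of_kings", "starcraft2")
print(f"Honor of Kings vs. atoms in the universe: a factor of 10^{gap_atoms}")
print(f"Honor of Kings vs. StarCraft II: a factor of 10^{gap_sc2}")
```

Working with exponents avoids building the huge integers themselves, which is all that is needed for an order-of-magnitude comparison.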

Like DeepMind, Tencent AI Lab teamed up with "Honor of Kings" to jointly develop the "Honor of Kings AI Open Research Environment", which is better suited to AI research.

Currently, the "Honor of Kings AI Open Research Environment" includes a 1v1 battle environment and a baseline algorithm model, and supports both mirror and non-mirror battle tasks for 20 heroes.

Specifically, the "Honor of Kings AI Open Research Environment" supports 20×20=400 battle sub-tasks when only the hero selection of both sides is considered. If summoner skills are included, there are 40,000 sub-tasks.
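The counting above can be reproduced in a few lines. This is a minimal sketch: the hero identifiers are placeholders, and the number of summoner skills per side is an assumption, because the article does not state it. Ten skills per side is simply the value that reproduces the quoted 40,000 if each side picks one hero and one skill independently.

```python
from itertools import product

NUM_HEROES = 20  # number of supported heroes, as stated in the article

# Placeholder identifiers; the real hero names are not listed in the article.
heroes = [f"hero_{i}" for i in range(NUM_HEROES)]

# Every ordered (own hero, opponent hero) pair is one battle sub-task.
matchups = list(product(heroes, repeat=2))
mirror = [m for m in matchups if m[0] == m[1]]
non_mirror = [m for m in matchups if m[0] != m[1]]
print(len(matchups), len(mirror), len(non_mirror))  # 400 20 380


def count_subtasks(num_heroes: int, num_skills: int) -> int:
    """Count sub-tasks when each side independently picks one hero and one skill."""
    return (num_heroes * num_skills) ** 2


# ASSUMPTION: 10 summoner skills per side (not stated in the article).
print(count_subtasks(NUM_HEROES, 10))  # 40000
```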

To better understand the generalization challenges an agent faces in the "Honor of Kings AI Open Research Environment", we can look at two tests from the paper:

First, build a behavior tree AI (BT) that plays at the entry-level "gold" rank. Its opponent is an agent (RL) trained with a reinforcement learning algorithm.

In the first experiment, the agent is trained only on Diao Chan (RL) versus Diao Chan (BT), and the trained RL agent (Diao Chan) is then used to challenge different heroes controlled by BT.

The results after 98 rounds of testing are shown in the figure below:

When the opponent hero changes, the performance of the same trained policy drops sharply. Because changing the opponent hero makes the test environment differ from the training environment, the policies learned by existing methods fail to generalize.

Figure 1 Generalization challenge across opponents

In the second experiment, the agent is again trained only on Diao Chan (RL) versus Diao Chan (BT), and the trained RL model is then used to control other heroes to challenge Diao Chan (BT).

The results after 98 rounds of testing are as shown below:

When the hero controlled by the model changes from Diao Chan to another hero, the performance of the same trained policy drops sharply, because the change of hero gives the actions a different meaning from Diao Chan's actions in the training environment.

Figure 2 Cross-target generalization challenge
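The two test protocols can be sketched as a small evaluation harness. This is an illustrative outline rather than the paper's code. The `play_match` callable, the function names and the hero names other than "diaochan" are hypothetical, and the dummy match function below is a seeded coin flip, not a real game result.

```python
import random
from typing import Callable, Dict, List

# A match function takes (controlled_hero, opponent_hero) and returns True if RL wins.
MatchFn = Callable[[str, str], bool]


def win_rate(play_match: MatchFn, controlled: str, opponent: str, rounds: int = 98) -> float:
    """Play `rounds` matches and return the fraction won by the RL agent."""
    wins = sum(1 for _ in range(rounds) if play_match(controlled, opponent))
    return wins / rounds


def cross_opponent_eval(play_match: MatchFn, trained_hero: str,
                        opponents: List[str], rounds: int = 98) -> Dict[str, float]:
    """Experiment 1: keep the controlled hero fixed and vary the BT opponent."""
    return {opp: win_rate(play_match, trained_hero, opp, rounds) for opp in opponents}


def cross_target_eval(play_match: MatchFn, targets: List[str],
                      opponent: str, rounds: int = 98) -> Dict[str, float]:
    """Experiment 2: vary the hero the model controls and keep the BT opponent fixed."""
    return {t: win_rate(play_match, t, opponent, rounds) for t in targets}


# Placeholder match function: a seeded coin flip standing in for a real game.
_rng = random.Random(0)


def dummy_match(controlled: str, opponent: str) -> bool:
    return _rng.random() < 0.5


heroes = ["diaochan", "hero_a", "hero_b"]  # "hero_a" and "hero_b" are placeholders
exp1 = cross_opponent_eval(dummy_match, "diaochan", heroes)
exp2 = cross_target_eval(dummy_match, heroes, "diaochan")
print("cross-opponent:", exp1)
print("cross-target:", exp2)
```

In a real run, `dummy_match` would be replaced by a function that plays one game in the environment and reports whether the RL side won.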

The reason for this result is simple. Each hero has its own unique mechanics. When an agent trained on a single hero is handed a new one, it does not know how to use it and is left fumbling.

The same goes for human players. Players who dominate the mid lane may not manage a good KDA after switching to the jungle.

It is not difficult to see that this actually goes back to the question we raised at the beginning. It is difficult to train "universal" AI in a simple environment. MOBA games with high complexity just provide an environment that is convenient for testing the generalization of the model.

Of course, the game cannot be used directly to train AI, so a specially optimized "training ground" came into being.

Researchers can therefore test and train their own models in environments such as the "StarCraft II Learning Environment" and the "Honor of Kings AI Open Research Environment".

How do domestic researchers access appropriate platform resources?

The development of DeepMind is inseparable from the strong support of Google. MineDojo proposed by Li Feifei's team not only uses the resources of Stanford, a top university, but also has strong support from NVIDIA.

The current domestic artificial intelligence industry is still not solid enough at the infrastructure level, especially for ordinary companies and universities, which are facing a shortage of research and development resources.

In order to allow more researchers to participate, Tencent officially opened the "Honor of Kings AI Open Research Environment" to the public on November 21 this year.

Users only need to register an account on the official website of Enlightenment Platform, submit information and pass the platform review to download it for free.

Website link: https://aiarena.tencent.com/aiarena/zh/open-gamecore

It is worth mentioning that, to better support scholars and algorithm developers in their research, the Enlightenment Platform not only packages the "Honor of Kings AI Open Research Environment" for ease of use, but also provides standard code and a training framework.

Next, let's take a quick look at how to start an AI training project on the Enlightenment Platform!

Since we want AI to "play" "Honor of Kings", the first thing to do is create the agent that controls the hero.

Sounds a bit complicated? In the "Honor of Kings AI Open Research Environment", it is actually very simple.

First, start the gamecore server:

cd gamecore
gamecore-server.exe server --server-address :23432
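Before moving on, it can be useful to confirm that something is listening on the chosen port. The following is a minimal sketch using only the Python standard library. It assumes the gamecore server accepts plain TCP connections on localhost port 23432, which the `--server-address :23432` flag suggests but the article does not confirm; the helper name is hypothetical.

```python
import socket


def is_server_reachable(host: str = "127.0.0.1", port: int = 23432,
                        timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# ASSUMPTION: the gamecore server listens on localhost:23432 over TCP.
print("gamecore server reachable:", is_server_reachable())
```

A `False` result only shows that no TCP listener answered on that address; it says nothing about whether the game core itself is configured correctly.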

Install the hok_env package:

git clone https://github.com/tencent-ailab/hok_env.git
cd hok_env/hok_env/
pip install -e .

Then run the test script:

cd hok_env/hok_env/hok/unit_test/
python test_env.py

Now, you can import hok and call hok.HoK1v1.load_game to create the environment:

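import hok
# Both sides use the same configuration: hero "diaochan" with skill "rage".
config_dicts = [{"hero": "diaochan", "skill": "rage"} for _ in range(2)]
env = hok.HoK1v1.load_game(runtime_id=0, game_log_path="./game_log", gamecore_path="~/.hok",
                           config_path="config.dat", config_dicts=config_dicts)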

Next, we obtain the first observation by resetting the environment:

obs, reward, done, infos = env.reset()

obs is a list of NumPy arrays describing each agent's observation of the environment.

reward is a list of floating point scalars describing the immediate reward received from the environment.

done is a Boolean list describing the state of the game.

infos is a tuple of dictionaries whose length is the number of agents.

Then perform operations in the environment until time runs out or the agent is killed.

Here, just use the env.step method.

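# done is described above as a Boolean list (one entry per agent),
# so keep stepping until any entry reports that the game is over.
done = [False]
while not any(done):
    action = env.get_random_action()
    obs, reward, done, state = env.step(action)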
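The loop above can be wrapped into a reusable episode runner. Since the real environment needs a running gamecore server, the sketch below uses a hypothetical `StubEnv` that only mimics the `reset`, `step` and `get_random_action` interface described in this article. It is not part of hok_env; plain Python lists stand in for NumPy arrays, and the rewards are random placeholders.

```python
import random


class StubEnv:
    """Hypothetical stand-in mimicking the interface described above (not part of hok_env)."""

    def __init__(self, num_agents: int = 2, episode_length: int = 5, seed: int = 0):
        self.num_agents = num_agents
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0

    def _result(self):
        obs = [[0.0] * 4 for _ in range(self.num_agents)]  # lists stand in for NumPy arrays
        reward = [self.rng.random() for _ in range(self.num_agents)]
        done = [self.t >= self.episode_length] * self.num_agents
        infos = tuple({} for _ in range(self.num_agents))
        return obs, reward, done, infos

    def reset(self):
        self.t = 0
        return self._result()

    def get_random_action(self):
        return [self.rng.randrange(3) for _ in range(self.num_agents)]

    def step(self, action):
        self.t += 1
        return self._result()


def run_episode(env, max_steps: int = 1000):
    """Step with random actions until any agent is done; return (steps, per-agent return)."""
    obs, reward, done, infos = env.reset()
    totals = [0.0] * len(reward)  # the reward returned by reset is not counted
    steps = 0
    while not any(done) and steps < max_steps:
        obs, reward, done, infos = env.step(env.get_random_action())
        totals = [t + r for t, r in zip(totals, reward)]
        steps += 1
    return steps, totals


steps, totals = run_episode(StubEnv())
print(steps, totals)
```

The `max_steps` cap is a defensive choice added here so that a misbehaving environment cannot keep the loop running forever.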

As with the "StarCraft II Learning Environment", you can also use visualization tools to view replays of the agent in the "Honor of Kings AI Open Research Environment".

At this point, your first agent has been created.

Next, you can put it through all kinds of training!

By now it should be clear that the "Honor of Kings AI Open Research Environment" is more than just an environment for training AI: through familiar operations and rich documentation, it makes the entire process simple and easy to follow.

This will allow more people who are interested in entering the AI field to get started easily.

Game AI, what other possibilities are there?

This leaves one question unanswered: as an enterprise-led research platform, why did Tencent's Enlightenment Platform choose to open up on such a large scale?

In August this year, the Chengdu Artificial Intelligence Industry Ecological Alliance and the think tank Yuqian Consultants jointly released the country’s first game AI report. It is not difficult to see from the report that games are one of the key points in promoting the development of artificial intelligence. Specifically, games can improve the application of AI in three aspects.


First of all, the game is an excellent training and testing ground for AI.

  • Rapid iteration: Games allow unlimited interaction and trial and error at no real-world cost. They also have an explicit reward mechanism, which makes the effectiveness of an algorithm easy to demonstrate.
  • Rich tasks: There are many types of games with varying difficulty and complexity, and AI must adopt complex strategies to deal with them. Conquering different types of games reflects improvements in algorithms.
  • Clear success or failure criteria: Game scores calibrate the ability of an AI and make further optimization easier.

Secondly, games can train different abilities of AI and lead to different applications.

For example, chess games train AI in sequential decision-making and long-term deduction; card games train AI to adapt dynamically; real-time strategy games train AI in machine memory, long-term planning, multi-agent collaboration, and action coherence.

In addition, the game can also break environmental constraints and promote intelligent decision-making.

For example, games can promote virtual simulation real-time rendering and virtual simulation information synchronization, and upgrade virtual simulation interactive terminals.

The Enlightenment Platform draws on the strengths of Tencent AI Lab and Honor of Kings in algorithms, computing power and complex scenarios. Once opened up, it can build a bridge for effective cooperation between games and AI development, linking university curriculum building, competition organization and industry talent incubation. When the talent pool is large enough, scientific research progress and commercial applications will spring up like mushrooms after rain.

In the past two years, the Enlightenment (Kaiwu) Platform has taken a number of steps across industry, academia and research. It held the "Kaiwu Multi-Agent Reinforcement Learning Competition", which attracted teams from top universities, including the "top two", Tsinghua University and Peking University. It also formed a university science and education consortium. The School of Information Science and Technology of Peking University launched a popular elective course, "Algorithms in Game AI", whose homework involved running experiments in the Honor of Kings 1v1 environment.

Looking ahead, we can expect the talent that emerges with the help of the "Enlightenment" platform to spread into every field of the AI industry and help the platform's upstream and downstream ecosystem flourish.

Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn to have it removed.