Xishanju AI technical expert Huang Hongbo: Practical integration of reinforcement learning and behavior trees in games
On August 6th and 7th, 2022, the AISummit Global Artificial Intelligence Technology Conference was held as scheduled. At the "Artificial Intelligence Frontier Exploration" sub-forum on the afternoon of the 7th, Xishanju AI technical expert Huang Hongbo gave a talk titled "Practical Combination of Reinforcement Learning and Behavior Trees in Games" and explained in detail the value of reinforcement learning in the game field.
Huang Hongbo said that putting reinforcement learning into practice is not about making the algorithm more powerful, but about combining reinforcement learning with deep learning and game design to form a complete solution and then making it work.
Reinforcement learning makes the game smarter
Applying reinforcement learning in a game can make the game smarter and more playable. This is the main purpose of using reinforcement learning in games.
"Reinforcement learning is a machine learning paradigm that trains an agent's strategy so that it can make a series of decisions," Huang Hongbo said. The agent's purpose is to output actions based on its observations of the environment. These actions lead to further observations and rewards. Training involves a great deal of trial and error as the agent interacts with the environment, and the strategy can improve with each iteration.
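The observe, act, reward loop described above can be sketched with tabular Q-learning. This is only an illustration, not the method used at Xishanju: the corridor environment, the reward values, and every name in the code (`step`, `train`, `greedy_policy`) are assumptions made up for the example.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01   # small cost per step, bonus at the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table by trial and error (epsilon-greedy Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # The reward and the next observation drive the update.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the learned Q-table."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` yields the action the trained agent prefers in each cell. In a real game the table would typically be replaced by a neural network, and the environment by the game client.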
In a game, the entity that takes actions or performs behaviors is the game agent. Consider a character or a robot in a game: it has to understand the state of the game, such as where the player is, and then make decisions based on that observation. In reinforcement learning, decisions are driven by rewards, which a game can provide as a high score or as progress to a new level on the way to a specific goal.
Huang Hongbo said that the most appealing aspect of the game setting is that the agent's strategy is trained under the pressure of the game itself. For example, it can learn how to handle an attack, or how to behave in order to achieve a specific goal.
The role of behavior trees in the game
A behavior tree is a tree structure containing logical nodes and behavior nodes. Usually, you abstract each situation into a type of node, write the nodes according to a common specification, and then connect them into a tree. Each time the AI needs to choose a behavior, it starts from the root node and checks each node in turn for a behavior that matches the current data.
Put simply, when AI modules are tightly coupled and coarse-grained, a single change often requires many modifications, and large amounts of duplicate code tend to appear. Behavior trees gave game developers a kind of "square notebook", a ready-made structure to work within, which makes it easier to build AI frameworks that are reusable and easy to extend and maintain. One way to state the contrast: a reinforcement learning strategy is obtained through training, whereas a behavior tree is essentially a combination of if and else statements.
As shown in the picture above, there is a root node, and below it a tree of nodes including escape, attack, wandering, and so on. Think of this tree as an AI or robot patrolling a jungle. When the AI sees an orc and judges that it cannot defeat it, that condition is triggered and the AI runs away, executing the Run action. When it judges that it can win the fight, it performs the Fight action.
In the above figure, two nodes are worth noting: Root, the root node, and Selector, a logical node. All nodes are executed in a fixed order from left to right. This is a behavior tree. You therefore only need to write the corresponding logic in each node for the AI to perform the related actions. Several behavior trees together make up the game's AI.
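The orc example can be sketched as a small behavior tree. This is a minimal illustration under stated assumptions: the node classes, the blackboard keys (`orc_visible`, `orc_power`, `own_power`), and the Wander fallback are hypothetical names chosen for the example, not taken from the figure or from any particular engine.

```python
SUCCESS, FAILURE = "success", "failure"


class Condition:
    """Leaf node: succeeds when the predicate holds for the blackboard."""
    def __init__(self, predicate):
        self.predicate = predicate

    def tick(self, bb):
        return SUCCESS if self.predicate(bb) else FAILURE


class Action:
    """Leaf node: records the chosen action on the blackboard."""
    def __init__(self, name):
        self.name = name

    def tick(self, bb):
        bb["action"] = self.name
        return SUCCESS


class Sequence:
    """Logical node: runs children left to right, fails on the first failure."""
    def __init__(self, *children):
        self.children = children

    def tick(self, bb):
        for child in self.children:
            if child.tick(bb) == FAILURE:
                return FAILURE
        return SUCCESS


class Selector:
    """Logical node: runs children left to right, stops at the first success."""
    def __init__(self, *children):
        self.children = children

    def tick(self, bb):
        for child in self.children:
            if child.tick(bb) == SUCCESS:
                return SUCCESS
        return FAILURE


def sees_orc(bb):
    return bool(bb.get("orc_visible"))


def orc_is_stronger(bb):
    return bb.get("orc_power", 0) > bb.get("own_power", 0)


# Root -> Selector -> [escape branch, attack branch, wandering]
ROOT = Selector(
    Sequence(Condition(sees_orc), Condition(orc_is_stronger), Action("Run")),
    Sequence(Condition(sees_orc), Action("Fight")),
    Action("Wander"),
)


def decide(observation):
    """Tick the tree once and return the selected action name."""
    bb = dict(observation)
    ROOT.tick(bb)
    return bb["action"]
```

For instance, `decide({"orc_visible": True, "orc_power": 10, "own_power": 5})` selects the escape branch, because the Selector tries its children from left to right and the first branch already succeeds.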
The combination of reinforcement learning and behavior trees makes the game richer
How can reinforcement learning and behavior trees be combined to make a game richer? This is a hard practical problem that comes up in many games.
Before answering, it is worth asking when reinforcement learning is the better choice and when behavior trees are. Huang Hongbo said that if a goal cannot be achieved with behavior trees, reinforcement learning can be used. For example, in an FPS (first-person shooter), decisions such as how much firepower to use, whom to fire at, and which weapon to choose are difficult to make with behavior trees. In such cases reinforcement learning is generally the better fit.
When should behavior trees be used? Suppose the character meets an obstacle in the game and needs to jump over it. You could handle this with reinforcement learning or with a behavior tree, but training a reinforcement learning model for it would be very troublesome. Since there is only one option in this situation, which is to jump, a behavior tree is simpler.
It follows that combining reinforcement learning and behavior trees in a game is a better solution than using either alone. Huang Hongbo said that there are two main ways to combine them: one takes reinforcement learning as the primary mechanism and behavior trees as the supplement; the other takes behavior trees as the primary mechanism and reinforcement learning as the supplement.
Behavior tree side: The behavior tree is the main driver of the AI. It receives observation (obs) input from the game client, and behaviors are written for those observations according to the game's goals. Within the tree, the nodes that need learned decisions are handed over to reinforcement learning, which must then be trained for those specific scenarios.
Reinforcement learning side: The overall approach becomes training several models, each of which executes one strategy, and then embedding them into the behavior tree.
Huang Hongbo said that which of these two approaches is better depends on the situation, the application, and the game, so no general rule applies.
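A minimal sketch of the combination: a rule node handles the obstacle case, and a node backed by a learned strategy handles combat. Everything here is an assumption made for illustration, including the observation keys, the action names, and `stub_combat_policy`, which is a hand-written stand-in for a trained model rather than anything described in the talk.

```python
from typing import Callable, Dict, List, Optional

Obs = Dict[str, float]
Node = Callable[[Obs], Optional[str]]   # returns an action, or None to pass


def stub_combat_policy(obs: Obs) -> str:
    """Placeholder for a trained model (hypothetical, hand-written here)."""
    return "fire_rifle" if obs.get("enemy_distance", 0.0) > 10.0 else "fire_shotgun"


def rule_jump(obs: Obs) -> Optional[str]:
    """Hand-written rule node: an obstacle has exactly one sensible response."""
    return "jump" if obs.get("obstacle_ahead", 0.0) > 0 else None


def make_policy_node(policy: Callable[[Obs], str],
                     applies: Callable[[Obs], bool]) -> Node:
    """Wrap a learned policy so it can sit in the tree like any other node."""
    def node(obs: Obs) -> Optional[str]:
        return policy(obs) if applies(obs) else None
    return node


def select(nodes: List[Node], obs: Obs, default: str = "patrol") -> str:
    """Selector: try nodes left to right and return the first action offered."""
    for node in nodes:
        action = node(obs)
        if action is not None:
            return action
    return default


AGENT: List[Node] = [
    rule_jump,
    make_policy_node(stub_combat_policy,
                     lambda o: o.get("enemy_visible", 0.0) > 0),
]


def act(obs: Obs) -> str:
    return select(AGENT, obs)
```

Embedding several trained models, as in the second approach, would follow the same shape: each model is wrapped with `make_policy_node` and added to the list, with its own condition deciding when it applies.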
In the remainder of the talk, Huang Hongbo described the technical framework that Xishanju uses for reinforcement learning and behavior trees and, drawing on a large number of game cases, explained in detail how the two are combined to make games richer. Readers interested in the case studies can watch the session videos from the AISummit Global Artificial Intelligence Technology Conference. (https://www.php.cn/link/53253027fef2ab5162a602f2acfed431)