On August 21, 2024, the session "SIMA: Developing a Generalist AI Agent Using Video Games" was held at the game developers conference CEDEC 2024.
Furthermore, by applying the knowledge gained from these projects to further research and combining it with the company's own AI models, DeepMind has continued to push this work forward.
SIMA uses games in its research because most of its members, including Mufarek himself and Google DeepMind CEO Demis Hassabis, are former game developers. "Games are in our DNA," he said, adding that SIMA's research and game development have more in common than people might think.
History of AI research using games

Mufarek says that games have long contributed to the advancement of AI research and will continue to be a driving force behind it. Specifically, games provide AI research with "rich, dynamic, and complex environments in which agents can interact and learn," "scalable and reproducible experiments," and "controlled and safe testing."

On rich, dynamic, and complex environments: the challenges presented in games, such as solving moving puzzles in a virtual space, strategizing against opponents, and adapting to changing situations, are comparable to the diverse situations found in the real world. Mufarek explained that they help AI models develop advanced problem-solving and decision-making abilities that can adapt to a wide range of situations.

On scalable and reproducible experiments: researchers can easily create instances of game environments, run many simulations simultaneously, and use the vast amounts of data collected to train and evaluate AI models. Experiments can also be replicated consistently, ensuring the reliability and validity of research results (a minimal code sketch of this idea appears at the end of this section).

On controlled and safe testing: evaluating an AI model's performance across a variety of virtual situations helps identify potential flaws and limitations and improve algorithms without the risks of real-world testing. This is particularly important for applications such as self-driving cars and medical diagnostics, where errors can have serious consequences.

Mufarek also presented cases in which games actually advanced AI research between 2010 and 2024, a period when reinforcement learning and deep learning improved dramatically. In the early 2010s, Google DeepMind took on the challenge of developing algorithms using Atari games and DQN (Deep Q-Network), producing an algorithm that demonstrated superhuman performance on more than 50 Atari games. In the mid-to-late 2010s, Microsoft developed "Project Malmo," an AI training project built on "Minecraft," and OpenAI's learning platform "Universe" offered a highly general-purpose interface that made it possible to scale games up for research. In the late 2010s, OpenAI's "OpenAI Five" for "Dota 2" appeared and DeepMind's agent "AlphaStar" reached top-player level in "StarCraft II," showing that AI could win even in complex games. During this period, Mufarek explained, researchers focused on a single environment with a customized action space, building bespoke research platforms by modifying a game's source code and implementing special APIs for the AI agent.

In 2017, the machine learning model "Transformer" announced by Google expanded the versatility of AI: chatbots built on large language models (LLMs) could summarize dialogue, write poetry, and analyze data. With further generalization, AI became able to generate images, audio, and video as well. However, Mufarek points out a limitation of such large-scale AI models: because they have no embodiment, they exist only in the digital realm and cannot act in the physical one. To use AI in the physical domain, it must be given a body through physical sensors, as in SoftBank's Pepper or Waymo's self-driving cars.
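As a purely illustrative aside (not shown in the talk), the "scalable and reproducible experiments" point can be sketched with the open-source Gymnasium library: many identical game-environment instances run in parallel, and fixed seeds make every run repeatable. CartPole and the random policy below are stand-ins for a real game and a real AI model.

```python
# Minimal sketch of scalable, reproducible game-based experiments
# (assumes the Gymnasium library and its bundled CartPole environment).
import gymnasium as gym
import numpy as np

NUM_ENVS = 8  # run several simulations simultaneously

# Create identical environment instances that step in lockstep.
envs = gym.vector.SyncVectorEnv(
    [lambda: gym.make("CartPole-v1") for _ in range(NUM_ENVS)]
)

# Seeding the environments and the policy makes rollouts reproducible.
observations, infos = envs.reset(seed=42)
rng = np.random.default_rng(42)

for step in range(100):
    # A random policy stands in for the AI model being evaluated.
    actions = rng.integers(0, envs.single_action_space.n, size=NUM_ENVS)
    observations, rewards, terminated, truncated, infos = envs.step(actions)

envs.close()
```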
The next chapter of AI research: SIMA

According to Mufarek, DeepMind advanced research on SIMA in order to overcome the limitations of AI models described above. The goal is to "develop an AI agent that can be conditioned by language": an agent that not only plays games autonomously, but that humans can instruct in natural language to do what they want. The hypothesis set up to reach this goal is that "if an AI agent can learn something in one environment and use that skill to do something in another environment, AI will become more general." In other words, instead of preparing a dedicated AI agent for each game title, a single agent should be able to carry over operations such as character and camera control from previous games when it encounters a new one.

To this end, DeepMind partnered with several game companies to build a learning portfolio for the AI agent. Specifically, the agent was trained on recordings of human gameplay from titles such as "No Man's Sky," "Valheim," "Teardown," and "Goat Simulator," and it seems that SIMA was realized by additionally giving text-based instructions. Onboarding of games into the research environment is done in cooperation with each game's developer, in order to clarify who is responsible for how the data used in the game and in the SIMA project is handled.

According to Mufarek, the SIMA project required a diverse and non-violent learning portfolio. For this reason, a variety of game titles were selected, including ones that are visually natural, industrial, realistic, or science-fiction, and ones played from a first-person or third-person perspective. Open-world and sandbox elements were also incorporated so that SIMA could take varied actions through complex mechanics.

SIMA uses a general-purpose interface, which is said to be in order to create a general-purpose AI agent. SIMA first receives goals and instructions from humans as natural-language text and perceives the screen in real time. Then, just like a human, it plays the game with a controller or a keyboard and mouse. Mufarek explained that thanks to this general-purpose interface, SIMA can be applied to any game without customization.

Two methods were used to create SIMA's training data. In the first, a single person plays the game, watches the recording afterwards, and annotates important points in natural language. In the second, two people work as a pair: one gives instructions in natural language while the other follows them, and the gameplay video is recorded and annotated. Keyboard and mouse operation data is then added to the SIMA dataset. These datasets cover the skills SIMA needs for gameplay, such as "creating objects" and "driving a car" in-game. Collected across all titles, the total amount of data is huge, yet still not enough for the SIMA project. Mufarek said that the higher the quality of the data and annotations, the more useful they are for improving SIMA, and that these efforts will continue.

Once the dataset is ready, SIMA's training can finally begin. The technique used here is "conditioned behavioral cloning," in which the agent learns by imitating human play.
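As an illustration only (this is a toy sketch, not SIMA's actual architecture or code), language-conditioned behavioral cloning can be pictured like this: the model receives a screen frame and a text instruction, predicts the recorded human action, and is trained with a cross-entropy imitation loss.

```python
# Toy sketch of "conditioned behavioral cloning" (hypothetical, not SIMA's code):
# predict the human player's action from the current frame plus a text instruction.
import torch
import torch.nn as nn

class ToyConditionedBC(nn.Module):
    def __init__(self, vocab_size=1000, num_actions=32, dim=128):
        super().__init__()
        # Tiny image encoder for 64x64 RGB frames (stand-in for a real vision model).
        self.vision = nn.Sequential(
            nn.Conv2d(3, 16, 8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * 6 * 6, dim),
        )
        # Bag-of-words text encoder (stand-in for a pretrained language model).
        self.text = nn.EmbeddingBag(vocab_size, dim)
        # Policy head: fuse both modalities and predict one of num_actions
        # discrete keyboard/mouse actions.
        self.policy = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                    nn.Linear(dim, num_actions))

    def forward(self, frames, instruction_tokens):
        fused = torch.cat([self.vision(frames), self.text(instruction_tokens)], dim=-1)
        return self.policy(fused)  # action logits

model = ToyConditionedBC()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# One imitation step on a fake batch of (frame, instruction, human action) triples.
frames = torch.rand(4, 3, 64, 64)               # screen captures
instructions = torch.randint(0, 1000, (4, 12))  # tokenized "chop the tree", etc.
human_actions = torch.randint(0, 32, (4,))      # recorded key/mouse actions

optimizer.zero_grad()
loss = loss_fn(model(frames, instructions), human_actions)
loss.backward()
optimizer.step()
```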
At its core is an architecture built around pre-trained models, but since Gemini did not yet exist when SIMA was developed, Classifier-Free Guidance (CFG) is used to prioritize verbal instructions over visual input, so that the agent follows natural-language instructions more faithfully (a schematic sketch of this guidance step appears at the end of this chapter).

In the phase to evaluate SIMA's results, a challenge set was created to measure performance on various tasks. A task has three elements: the "initial state" in which SIMA starts acting, the "goal/instruction" that SIMA must follow, and the "success criteria" that determine whether the task has been accomplished. Evaluation is done from three perspectives: "ground truth," which programmatically determines whether a task was completed successfully; "optical character recognition (OCR)," which gives feedback on actions based on changes in on-screen text; and "human evaluation," in which a person checks the video and confirms whether the task was completed.

SIMA early research results and limitations of this approach

Early results from the project showed that SIMA can complete tasks common to many games, such as "moving forward" and "opening a menu." It also completed tasks whose meaning differs from game to game, such as taking off in a spaceship in "No Man's Sky" or piloting a boat in "Teardown." Whether SIMA could complete tasks specific to each game was evaluated using three separately prepared setups. The first is "Specialist," trained on data from a single game and evaluated in the same environment; this serves as the 100% baseline. The second is "SIMA," trained on data from 10 games and then tested in the environment of one of them. The third is "Zero-Shot," trained on data from 9 of the 10 titles and tested in the environment of the remaining one. As a result, SIMA demonstrated higher performance than Specialist when trained on all 10 titles, and performance close to Specialist even in the Zero-Shot setting. Mufarek was very pleased to confirm that "an AI agent can learn something in one environment and use that skill to do something in another environment."

However, because the project's goal is to "develop an AI agent that is conditioned by language," training and testing were also performed without natural-language annotations, and SIMA's performance dropped significantly. Even so, these results proved for the first time the hypothesis that "training a single agent across many large-scale environments leads to transfer of learning and generalization," and Mufarek said this is what motivates the SIMA research going forward.

Until now, research has focused on training to improve an AI agent's performance in a specific game, but, for example, AlphaStar's performance degraded after updates to "StarCraft II." Mufarek said, "It's not realistic to retrain the AI agent every time the game is updated," and believes that by making SIMA more general-purpose, the agent will keep performing well even when new features are added to a game.
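As mentioned at the start of this chapter, CFG is used to strengthen the influence of the language instruction. The sketch below is hypothetical and not SIMA's actual implementation: the policy is evaluated twice per step, with and without the instruction, and the difference is amplified so the language steers the chosen action more than the visual input alone would.

```python
# Hypothetical sketch of classifier-free guidance (CFG) for an
# instruction-conditioned policy (not SIMA's real code).
import torch

def cfg_action_logits(policy, frame, instruction_tokens, null_tokens, scale=2.0):
    """Return action logits pushed toward the language-conditioned behavior.

    policy: any model mapping (frame, tokens) -> action logits,
            e.g. the ToyConditionedBC sketch shown earlier.
    null_tokens: an "empty instruction" input, standing in for no conditioning.
    scale: values > 1 strengthen the instruction's influence over the visuals.
    """
    logits_cond = policy(frame, instruction_tokens)   # with the instruction
    logits_uncond = policy(frame, null_tokens)        # without it
    return logits_uncond + scale * (logits_cond - logits_uncond)

# Usage with the earlier toy model (all inputs are dummies):
# model = ToyConditionedBC()
# frame = torch.rand(1, 3, 64, 64)
# instruction = torch.randint(0, 1000, (1, 12))
# empty = torch.zeros(1, 12, dtype=torch.long)  # padding-only "no instruction"
# action = cfg_action_logits(model, frame, instruction, empty).argmax(dim=-1)
```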
SIMA is also good at tasks that can be completed in a short time, such as "gathering firewood" and "setting the firewood on fire," but it is not as good at tasks that require planning, multiple steps, and reasoning, such as "building a house." Here, Gemini can now be a powerful support for SIMA: for example, Gemini can act as a director, breaking a long task like "building a house" into short tasks and handing them to SIMA one by one. Mufarek reiterated that while the SIMA project is very exciting and promises great versatility, it has not yet become a fully general-purpose AI agent, and that once it does, further developments will become possible.
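The "director" pattern described above can be pictured as follows. This is a hypothetical sketch: `call_llm` and `agent.execute` are made-up placeholders, not real Gemini or SIMA APIs.

```python
# Hypothetical sketch of an LLM "director" decomposing a long-horizon goal
# into short sub-tasks that an instruction-conditioned agent executes in order.
from typing import Callable, List

def decompose_goal(call_llm: Callable[[str], str], goal: str) -> List[str]:
    """Ask the director LLM for a list of short, executable sub-tasks."""
    prompt = (
        "Break the following game objective into short sub-tasks, "
        "one per line, each achievable in under a minute:\n" + goal
    )
    reply = call_llm(prompt)
    return [line.strip() for line in reply.splitlines() if line.strip()]

def run_with_director(call_llm, agent, goal: str) -> None:
    """Feed each sub-task to the instruction-conditioned agent, one at a time."""
    for subtask in decompose_goal(call_llm, goal):
        success = agent.execute(subtask)  # placeholder: play until success criteria met
        if not success:
            break  # a real system might ask the director to re-plan instead

# Example: run_with_director(my_llm, my_sima_like_agent, "build a house")
# might yield sub-tasks such as "gather wood", "craft planks", "place walls".
```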