Teams from CMU and ETH Zurich have jointly developed a framework called "Agile But Safe" (ABS), which lets quadruped robots move at high speed through complex environments. The framework avoids collisions effectively while reaching a top speed of 3.1 m/s, a notable advance for legged robots.
In high-speed robot motion, maintaining speed and safety at the same time has long been a major challenge. A research team at Carnegie Mellon University (CMU) and ETH Zurich (ETH) recently reported a breakthrough: their new quadruped robot algorithm moves quickly through complex environments while skillfully avoiding obstacles, achieving the goal of being "agile but safe". The algorithm's strength lies in perceiving the surrounding environment quickly and making decisions from real-time data. Using onboard sensors and computation, the robot senses the obstacles around it and avoids them by adjusting its gait and trajectory. Successful application of this technology should help advance high-speed robots.
Paper address: https://arxiv.org/pdf/2401.17583.pdf
With the support of ABS, the robot dog has demonstrated amazing high-speed obstacle avoidance capabilities in various scenarios:
Narrow corridors with many obstacles:
Messy indoor scenes:
Whether on grass or in other outdoor settings, and whether the obstacles are static or dynamic, the robot dog handles them calmly:
When encountering a stroller, the robot dog dodges nimbly:
Warning signs, boxes, and chairs are also not a problem:
It can also easily sidestep mats and human feet that suddenly appear:
The robot dog can even play "eagle catches the chicks", a Chinese children's chasing game:
ABS Breakthrough Technology:
RL plus learned model-free Reach-Avoid values
ABS adopts a dual-policy setup consisting of an Agile Policy and a Recovery Policy. The agile policy lets the robot move quickly among obstacles, while the recovery policy steps in to keep the robot safe once the Reach-Avoid value estimate detects potential danger (such as a stroller suddenly appearing).
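The switching rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the threshold of zero, the convention that a higher RA value means more danger, and the toy value function are all assumptions made for the example.

```python
from typing import Callable, Sequence

# Hypothetical threshold: assumes RA values at or above it signal danger.
RA_THRESHOLD = 0.0


def select_policy(
    obs: Sequence[float],
    ra_value: Callable[[Sequence[float]], float],
    threshold: float = RA_THRESHOLD,
) -> str:
    """Return which policy should act for this observation."""
    if ra_value(obs) >= threshold:
        return "recovery"  # predicted failure: hand over to the backup policy
    return "agile"         # predicted safe: keep running fast


def toy_ra_value(obs: Sequence[float]) -> float:
    """Toy stand-in for a learned RA value network (assumption).

    obs = (distance to nearest obstacle in m, forward speed in m/s).
    Positive when a crude stopping distance exceeds the clearance.
    """
    distance, speed = obs
    return 0.5 * speed - distance


if __name__ == "__main__":
    print(select_policy((2.0, 3.0), toy_ra_value))  # far obstacle
    print(select_policy((1.0, 3.0), toy_ra_value))  # close obstacle
```

In a real controller the check would run at every control step, so the recovery policy can take over as soon as the estimate crosses the threshold and hand control back once it drops below it.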
Innovation point 1: How is the Agile Policy trained? Unlike earlier approaches that simply track velocity commands, the agile policy is formulated as goal reaching (position tracking) to maximize the robot's agility. The policy is trained to develop sensorimotor skills that bring the robot to a specified goal without collisions. Because the reward favors high base speed, the robot naturally learns to be as agile as possible while avoiding collisions. This overcomes the conservatism that traditional velocity-tracking policies can show in complex environments and improves both the speed and the safety of the robot among obstacles. In real-robot tests, the Agile Policy reached a maximum speed of 3.1 m/s.
Innovation point 2: Learning a policy-conditioned Reach-Avoid value. The "Reach-Avoid" (RA) value is learned in a model-free way, which differs from traditional model-based reachability analysis and is a better fit for model-free reinforcement learning policies. Rather than learning a global RA value, the approach conditions the value on a specific policy, so it can better predict when the agile policy will fail. With a simplified set of observations, the RA value network generalizes well and predicts safety risks. The RA value then guides the recovery policy, helping the robot adjust its motion to avoid collisions, so that agility improves while safety is preserved.
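To make the idea of a learned RA value more concrete, here is a minimal sketch of a one-step regression target for it. The article does not give the update rule, so the formula below (a discounted reach-avoid Bellman backup of the kind used in the reach-avoid RL literature), the margin functions `l` and `zeta`, and the sign convention (negative means the goal is reached safely, positive means failure) are all assumptions.

```python
def ra_target(l: float, zeta: float, v_next: float, gamma: float = 0.99) -> float:
    """One-step target for a policy-conditioned reach-avoid value (sketch).

    l      : target margin, assumed <= 0 once the goal is reached
    zeta   : failure margin, assumed > 0 on collision
    v_next : RA value predicted at the next state under the same policy
    gamma  : discount factor
    """
    # Value if the episode ended right now.
    terminal = max(l, zeta)
    # Value if it continues: best of "reached now" and the future value,
    # but never better than the current failure margin.
    backup = max(zeta, min(l, v_next))
    return gamma * backup + (1.0 - gamma) * terminal


if __name__ == "__main__":
    # Collision now: the target stays positive regardless of the future.
    print(ra_target(l=0.5, zeta=1.0, v_next=-1.0))
    # On the way and currently safe, with a safe predicted future.
    print(ra_target(l=0.4, zeta=-0.2, v_next=-0.3, gamma=0.9))
```

Because the next-state value comes from rollouts of the agile policy itself, a network regressed onto these targets predicts that particular policy's failures rather than the outcome of an ideal controller, which is the sense in which the value is policy-conditioned.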
The following figure shows the RA (reach-avoid) values learned for a specific set of obstacles. As the robot's speed changes, the landscape of RA values changes accordingly. The sign of the RA value is a reasonable indicator of whether the agile policy is safe. In other words, the figure shows, through the RA values, how much risk the robot faces from specific obstacles at different speeds and in different states.
Innovation point 3: Using the Reach-Avoid value and the recovery policy to save the robot. The recovery policy lets the quadruped robot quickly track linear and angular velocity commands, serving as a backup safeguard. Unlike the agile policy, its observation space is built around tracking linear and angular velocity commands and needs no exteroceptive sensing. Its task rewards focus on linear velocity tracking, angular velocity tracking, staying alive, and maintaining posture, so that control can switch back smoothly to the agile policy. The policy is also trained in simulation, but with specific domain randomization and curriculum settings that better match the conditions under which the recovery policy is triggered. This gives the quadruped robot the ability to respond quickly to potential failures during high-speed motion.
The figure below visualizes the RA (reach-avoid) value landscape when the recovery policy is triggered in two specific situations (I and II). The landscapes are plotted in the vx (velocity along the x-axis) versus ωz (angular velocity about the z-axis) plane and in the vx versus vy (velocity along the y-axis) plane. The figure marks the initial twist before the search (that is, the current linear and angular velocity of the robot base) and the commands found by the search. Simply put, these plots show the motion commands the recovery policy's search selects under specific conditions and how those commands change the RA value, reflecting how safe the robot is in different motion states.
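The command search described above can be illustrated with a small grid search in the vx versus ωz plane. This is a sketch under stated assumptions, not the authors' method: the article does not give the search procedure or its objective, so exhaustive grid search, picking the lowest RA value, the command ranges, and the toy value function are all hypothetical.

```python
import itertools
from typing import Callable, Tuple

Twist = Tuple[float, float]  # (vx in m/s, wz in rad/s)


def linspace(lo: float, hi: float, n: int) -> list:
    """Evenly spaced values from lo to hi inclusive."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def search_recovery_command(
    ra_value: Callable[[Twist], float],
    current_twist: Twist,
    vx_range: Tuple[float, float] = (0.0, 3.0),   # assumed command limits
    wz_range: Tuple[float, float] = (-2.0, 2.0),  # assumed command limits
    steps: int = 11,
) -> Tuple[Twist, float]:
    """Return the candidate command with the lowest RA value, and that value."""
    best_cmd, best_val = current_twist, ra_value(current_twist)
    candidates = itertools.product(
        linspace(*vx_range, steps), linspace(*wz_range, steps)
    )
    for cmd in candidates:
        val = ra_value(cmd)
        if val < best_val:
            best_cmd, best_val = cmd, val
    return best_cmd, best_val


def toy_ra(twist: Twist) -> float:
    """Toy landscape (assumption): slowing down and turning left is safer."""
    vx, wz = twist
    return vx - 0.5 * wz - 0.5


if __name__ == "__main__":
    cmd, val = search_recovery_command(toy_ra, current_twist=(3.0, 0.0))
    print(cmd, val)
```

With the toy landscape, the search moves from the fast, straight initial twist toward a slow left turn. A practical objective would likely also weigh progress toward the goal rather than minimizing the RA value alone.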
The authors tested the robustness of the ABS framework in four scenarios (a 12 kg payload, a basketball impact, a kick, and snow), and the robot dog responded calmly in each:
This research was completed jointly by research teams at CMU and ETH. The team members are Tairan He, Chong Zhang, Wenli Xiao, Guanqi He, Changliu Liu and Guanya Shi. Their collaboration produced a significant advance in robotics and opens up new applications for quadruped robots. The work demonstrates the potential of quadruped robots for high-speed movement with safe obstacle avoidance. In the future, fast and safe quadruped robots of this kind are expected to play an important role in areas such as search and rescue, exploration, and even home services.