Where has the "embodied intelligence" that Fei-Fei Li champions reached?
In 2009, Fei-Fei Li, then a computer scientist at Princeton University, led the construction of a dataset that changed the history of artificial intelligence: ImageNet. It contains millions of labeled images that can be used to train complex machine learning models to recognize objects in pictures.
In 2015, machines' recognition accuracy on ImageNet surpassed that of humans. Fei-Fei Li soon turned to a new goal: finding what she calls another "North Star" (here, a key scientific problem that focuses researchers' enthusiasm and drives breakthrough progress).
She found inspiration by looking back 530 million years to the Cambrian explosion, when many modern animal groups first appeared. One influential theory holds that the burst of new species was driven in part by the emergence of eyes, which let creatures see the world around them for the first time. Fei-Fei Li believes that animal vision does not arise in isolation: it is "deeply embedded in a whole that needs to move, navigate, survive, manipulate and change in a rapidly changing environment," she said, "so it was natural for me to turn to a more active field of AI."
Today, Fei-Fei Li's work focuses on AI agents that do more than accept static images as data: they can move around within simulated three-dimensional virtual worlds and interact with their surroundings.
This is the broad goal of a new field called "embodied AI." It overlaps with robotics, since robots can be viewed as embodied AI agents with physical bodies in the real world, and with reinforcement learning, in which agents learn through interaction. Fei-Fei Li and others believe that embodied AI could drive a major transformation, from simple machine-learning abilities such as recognizing images to learning how to perform complex, human-like tasks with multiple steps, such as making an omelet.
Today, embodied AI covers any agent that can sense and modify its own environment. In robotics, the AI agent always inhabits a robot body; in realistic simulations, an agent may have a virtual body, or it may perceive the world through a movable camera and interact with its surroundings that way. "The meaning of embodiment is not the body itself, but the overall need and functionality of interacting with the environment and doing things in it," Fei-Fei Li explained.
This interactivity gives agents a new, and in many cases better, way to understand the world. Where before you could only observe a possible relationship between two objects, now you can experiment and bring that relationship about yourself. With that kind of understanding comes greater insight. And with a new set of virtual worlds up and running, embodied AI agents have begun to realize this potential, making significant progress in their new environments.
"Right now, we don't have any evidence for the existence of intelligence that doesn't learn by interacting with the world," said Viviane Clay, an embodied AI researcher at the University of Osnebruck in Germany.
Although researchers have long wanted to create realistic virtual worlds for AI agents to explore, such worlds have only existed for about five years, made possible by advances in graphics from the film and video game industries. By 2017, AI agents could make themselves at home in realistically rendered interior spaces: a virtual, but literal, "home." Computer scientists at the Allen Institute for Artificial Intelligence built a simulator called AI2-THOR that lets agents roam naturalistic kitchens, bathrooms, living rooms, and bedrooms. Agents learn from three-dimensional views that change as they move, with the simulator showing new angles whenever they decide to take a closer look.
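To make the idea concrete, here is a minimal sketch of stepping an agent through such a simulator with the open-source ai2thor Python package; the scene name and printed fields are assumptions for illustration, so the project's documentation remains the authority on the exact API.

```python
# A minimal sketch of an agent wandering an AI2-THOR kitchen scene.
# Scene name and metadata fields are illustrative assumptions.
from ai2thor.controller import Controller

controller = Controller(scene="FloorPlan1")  # one of the built-in kitchen scenes

# Each action returns an event with a fresh first-person frame,
# so the agent sees a new angle of the room after every step.
for action in ["MoveAhead", "RotateRight", "MoveAhead", "LookDown"]:
    event = controller.step(action=action)
    frame = event.frame  # RGB image from the agent's new viewpoint
    position = event.metadata["agent"]["position"]
    print(action, frame.shape, position)

controller.stop()
```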
These new worlds also let agents reason about change along a new dimension: time. "That's a big change," said Manolis Savva, a computer graphics researcher at Simon Fraser University. "In an embodied AI setting, you have these temporally coherent streams of information that you can control."
These simulated worlds are now good enough to train agents on entirely new tasks. Rather than just recognizing an object, they can interact with it, pick it up and navigate around it: seemingly small steps, but ones any agent needs in order to understand its environment. And in 2020, virtual agents gained the ability to go beyond vision and hear the sounds that virtual objects make, offering another perspective on objects and how they behave in the world.
Embodied AI agents that act in a virtual world (here, the ManipulaTHOR environment) learn in a different way, which may suit more complex, human-like tasks.
Simulators have their limits, however. "Even the best simulators are far less realistic than the real world," says Daniel Yamins, a computer scientist at Stanford University. With colleagues at MIT and IBM, Yamins co-developed ThreeDWorld, a project focused on simulating real-life physics in virtual worlds, such as the behavior of liquids and the way some objects are rigid in one region and flexible in another.
This is a very challenging task that requires AI to learn in new ways.
So far, a simple way to measure the progress of embodied AI has been to compare the performance of embodied agents with that of algorithms trained on simpler, static image tasks. Researchers note that these comparisons are not perfect, but early results do suggest that embodied AI agents learn differently from, and sometimes better than, their predecessors.
In a recent paper ("Interactron: Embodied Adaptive Object Detection"), researchers found that an embodied AI agent was more accurate at detecting specified objects, improving on traditional methods by nearly 12%. "It took the object detection field more than three years to achieve that level of improvement," said study co-author Roozbeh Mottaghi, a computer scientist at the Allen Institute for Artificial Intelligence. "And we made that much progress just by interacting with the world."
Other papers have shown that such algorithms improve when they are given an embodied form and allowed to explore a virtual space, or to move around an object and collect multiple views of it.
Researchers have also found that embodied and traditional algorithms learn in fundamentally different ways. Consider the neural network, the basic ingredient behind the learning ability of every embodied algorithm and many disembodied ones. A neural network is made up of many layers of connected artificial neurons, loosely modeled on networks in the human brain. In two separate papers, researchers found that in the neural networks of embodied agents, fewer neurons respond to any given visual input, meaning each individual neuron is more selective about what it responds to. Disembodied networks were much less efficient, requiring many more neurons to stay active most of the time. One research team (led by Grace Lindsay, an incoming professor at NYU) even compared embodied and disembodied neural networks with neuronal activity in a living brain, the visual cortex of mice, and found that the embodied networks were the closest match.
Lindsay is quick to point out that this doesn't necessarily mean the embodied versions are better; they are just different. Unlike the object-detection work, Lindsay's study compared the underlying activity of the same kind of neural network performing completely different tasks, and networks may need to work differently to accomplish different goals.
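As a rough illustration of the "fewer neurons respond" finding (and not the analysis used in those papers), one could compare how large a fraction of a network's hidden units respond strongly to the same batch of images; a smaller fraction suggests more selective units. The models, layer choice and threshold below are all hypothetical.

```python
# Toy metric: fraction of hidden units with a substantial average response.
# The threshold, the encoders and the choice of layer are illustrative assumptions.
import torch

def active_fraction(encoder, images, threshold=0.1):
    with torch.no_grad():
        activations = encoder(images)              # shape: (num_images, num_units)
    mean_response = activations.abs().mean(dim=0)  # average response strength per unit
    return (mean_response > threshold).float().mean().item()

# Hypothetical usage: a lower fraction means individual units are more selective.
# frac_embodied = active_fraction(embodied_encoder, image_batch)
# frac_static   = active_fraction(static_encoder, image_batch)
```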
Comparing embodied networks to disembodied ones is one way to measure improvement, but researchers are not really trying to improve embodied agents' performance on existing tasks; their real goal is to have agents learn more complex, more human-like tasks. That is what excites researchers most, and they are seeing impressive progress, especially on navigation tasks, in which an agent must remember the long-term goal of its destination while forming a plan to get there without getting lost or bumping into objects.
In just a few years, a team led by Dhruv Batra, a research director at Meta AI and a computer scientist at the Georgia Institute of Technology, has made rapid progress on a specific navigation task called point-goal navigation. Here the agent is dropped into a completely new environment and must travel to a given coordinate ("go to the point 5 meters north and 10 meters east") without a map.
By training the agent in a Meta virtual world called AI Habitat and giving it a GPS and a compass, Batra said, they reached better than 99.9% accuracy on a standard dataset. More recently, they extended the result to a harder, more realistic scenario without the compass or GPS: the agent achieved 94% accuracy by estimating its position using only the stream of pixels it sees while moving.
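For readers unfamiliar with the task, the sketch below shows the shape of a point-goal episode: the agent sees an image plus its displacement to the goal (what a GPS-and-compass sensor would report) and succeeds only if it chooses to stop within a small radius of the target. This is illustrative pseudocode, not the AI Habitat API; `env` and `policy` are hypothetical objects.

```python
# Illustrative point-goal navigation loop (not the AI Habitat API).
# The agent succeeds if it calls "stop" within a small radius of the goal.
import numpy as np

def run_pointgoal_episode(env, policy, success_radius=0.2, max_steps=500):
    obs = env.reset()  # e.g. {"rgb": image, "goal": (dx, dy) relative to the agent}
    for _ in range(max_steps):
        action = policy(obs)  # one of: "move_forward", "turn_left", "turn_right", "stop"
        if action == "stop":
            # Success is judged only when the agent decides it has arrived.
            return np.linalg.norm(obs["goal"]) < success_radius
        obs = env.step(action)
    return False  # ran out of steps without stopping near the goal
```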
The AI Habitat virtual world created by Dhruv Batra's team at Meta AI. They hope to speed up simulation until embodied AI agents can accumulate 20 years of simulated experience in just 20 minutes of wall-clock time.
"This is a great improvement, but it doesn't mean navigation is completely solved," Mottaghi said. Many other kinds of navigation tasks involve more complex language instructions, such as "go through the kitchen and fetch the glasses on the nightstand in the bedroom," and there the accuracy is still only about 30% to 40%.
But navigation remains one of the simplest tasks in embodied AI, since the agent does not need to manipulate anything as it moves through the environment. So far, embodied AI agents are far from mastering any object-related tasks. Part of the challenge is that when an agent interacts with new objects, it can make many errors, and the errors can pile up. Currently, most researchers address this problem by choosing tasks with only a few steps, but most human-like activities, such as baking or washing dishes, require long sequences of actions on multiple objects. To achieve this goal, AI agents will need to make even greater advances.
Here, Fei-Fei Li may again be at the forefront: her team has developed a simulated dataset called BEHAVIOR, which she hopes will do for embodied AI what ImageNet did for object recognition.
The dataset contains more than 100 human activities for agents to complete, and agents can be tested on them in any virtual environment. By defining metrics that compare agents performing these tasks against real videos of humans performing the same tasks, it will let the community better assess the progress of virtual AI agents.
Once agents succeed at these complex tasks, Fei-Fei Li believes, the purpose of simulation is to prepare them for the final space in which they will operate: the real world.
"In my opinion, simulation is one of the most important and exciting areas in robotics research." Li Feifei said.
Robots are essentially embodied intelligence. They inhabit some kind of physical body in the real world and represent the most extreme form of embodied AI agent. But many researchers have found that even such agents can benefit from training in virtual worlds.
The most advanced algorithms in robotics, such as reinforcement learning, often require millions of iterations to learn something meaningful, Mottaghi said. Therefore, training real robots to perform difficult tasks can take years.
Robots can navigate uncertain terrain in the real world. New research shows that training in virtual environments can help robots master these and other skills.
But training in a virtual world first is much faster: thousands of agents can be trained simultaneously in thousands of different rooms. And virtual training is safer for both robots and humans, as the sketch after this paragraph illustrates.
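A rough sketch of why that parallelism helps: one shared policy can gather experience from many simulated rooms stepped side by side, and a learner then updates the policy from all of it at once. The `env` and `policy` objects and their methods are illustrative, not a particular simulator's API.

```python
# A minimal sketch of scaled-up simulated training: many environment copies
# are rolled out in parallel for one shared policy. All names are illustrative.
from concurrent.futures import ThreadPoolExecutor

def rollout(env, policy, steps=100):
    """Collect (observation, action, reward) transitions from one simulated room."""
    obs = env.reset()
    transitions = []
    for _ in range(steps):
        action = policy(obs)
        obs, reward, done = env.step(action)
        transitions.append((obs, action, reward))
        if done:
            obs = env.reset()
    return transitions

def parallel_rollouts(envs, policy, steps=100):
    # Thousands of rooms can be stepped at once; a learner then updates the
    # shared policy using experience gathered from all of them.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda env: rollout(env, policy, steps), envs))
```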
Many robotics experts began paying closer attention to simulators in 2018, when OpenAI researchers demonstrated that skills an agent learns in a virtual world can transfer to the real world: they trained a robotic hand to manipulate a cube it had only ever seen in simulation. More recent work has taught drones to avoid collisions in the air, deployed self-driving cars in urban environments on two different continents, and had a four-legged robot dog complete an hour-long hike in the Swiss Alps (taking about the same time a human would).
In the future, researchers may also bridge the gap between simulation and the real world by sending humans into virtual spaces through virtual-reality headsets. Dieter Fox, senior director of robotics research at Nvidia and a professor at the University of Washington, pointed out that a key goal of robotics is to build robots that are helpful to humans in the real world; to do that, they must first be exposed to humans and learn how to interact with them.
Using virtual reality to put humans into these simulated environments, where they can demonstrate tasks and interact with the robots, would be a very powerful approach, Fox said.
Whether in simulation or in the real world, embodied AI agents are learning to act more like humans and to complete increasingly human-like tasks, and the field is advancing on every front: new worlds, new tasks and new learning algorithms.
“I see the fusion of deep learning, robot learning, vision and even language,” Li Feifei said. “Now I think that through this ‘moonshot’ or ‘North Star’ for embodied AI, we will Learning the basic technologies of intelligence can truly lead to major breakthroughs."
Fei-Fei Li's essay on the "North Stars" of computer vision: https://www.amacad.org/publication/searching-computer-vision-north-stars