The secret of robot implementation: continuous learning, knowledge transfer and autonomous participation

This article is reproduced from Lei Feng.com. If you need to reprint, please go to the official website of Lei Feng.com to apply for authorization.

On May 23, 2022, ICRA 2022 (IEEE International Conference on Robotics and Automation), the annual top international conference in the field of robotics, was held as scheduled in Philadelphia, USA.

This is the 39th year of ICRA. ICRA is the flagship conference of the IEEE Robotics and Automation Society and the primary international forum for robotics researchers to present and discuss their work.

At this year’s ICRA, three of Amazon’s chief robotics experts, Sidd Srinivasa, Tye Brady and Philipp Michel, briefly discussed the challenges of building robotic systems for human-machine interaction in the real world.

Note: From left to right are Sidd Srinivasa, director of artificial intelligence for Amazon Robotics; Tye Brady, chief technologist of Amazon Robotics (Global); and Philipp Michel, senior manager of applied science at Amazon Scout.

Sidd Srinivasa is a world-renowned robotics expert and IEEE Fellow. He is currently the Boeing Endowed Professor at the University of Washington and leads artificial intelligence at Amazon Robotics. He is responsible for the algorithms of the autonomous robots that assist employees in Amazon's logistics centers, including research into robots that can pack and package products and cart-style robots that can autonomously lift, unload, and transport goods.

Tye Brady is the chief technologist of Amazon Robotics (Global) and has a master's degree in aerospace engineering from MIT. Philipp Michel, like Sidd Srinivasa, is a doctoral alumnus of the CMU Robotics Institute; he is a senior manager on Amazon's Scout robot project.

They shared their views on the challenges of deploying robots in the real world. AI Technology Review has compiled their remarks without changing the original meaning, as follows:

Q: Your research in the field of robotics solves different problems. What are the similarities between these problems?

Sidd Srinivasa: An important difficulty in robotics research is that we live in an open world. We don't even know what inputs we are about to face. In our fulfillment centers, we have over 20 million items to handle, and the number of items grows by the hundreds every day. Most of the time, our robots don't know what the items they are picking up are, but they need to pick them up carefully and package them quickly without damaging them.

Philipp Michel: For Scout, the difficulty lies in the objects encountered on the sidewalk and in the environment the robot travels through. We have delivery robots deployed in four states across the United States. Weather conditions, lighting conditions... it was clear from the beginning that we would have to deal with a large number of variables to enable the robot to adapt to complex environments.

Tye Brady: In developing fulfillment robots, we have a significant advantage in that we operate in a semi-structured environment. We can make our own traffic rules for robots, and understanding the environment really helps our scientists and engineers gain a deep understanding of the objects we want to move, manipulate, classify, and identify to fulfill orders. In other words, we get to pursue this technology in the real world.

Philipp Michel: Another thing we have in common is that we rely heavily on learning from data to solve problems. Scout receives real-world data as it performs tasks and then iteratively develops machine learning solutions for perception, localization, and navigation.

Sidd Srinivasa: I completely agree (learning to solve problems from data). I think machine learning and adaptive control are key to super-linear scaling. If we deploy thousands of robots, we can't have thousands of scientists and engineers working on them. We need to rely on real-world data to achieve super-linear growth.

In addition, I think the open world will force us to think about how to learn continually. Our machine learning models are usually trained on a particular distribution of input data, but because this is an open world, we run into the problem of "covariate shift": the data we see at deployment does not match the training distribution, which often makes machine learning models overconfident for no good reason.

Therefore, a lot of the work we do is to create "watchdogs" (supervisory components) that identify when the input data distribution deviates from the distribution the model was trained on. Then we perform "importance sampling" so that we can pick out the data that has changed and retrain the machine learning model.
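The watchdog idea can be illustrated with a minimal sketch. It is not Amazon's implementation: the class `ShiftWatchdog`, the function `sample_for_retraining`, the single numeric feature, and the z-score threshold are all assumptions made for illustration, and the weighted draw is only a simple stand-in for the "importance sampling" step mentioned above.

```python
import random
import statistics


class ShiftWatchdog:
    """Hypothetical watchdog: flags inputs far from the training distribution."""

    def __init__(self, training_values, z_threshold=3.0):
        # Summarize the training distribution of one numeric feature.
        # Assumes the training values are not all identical (std > 0).
        self.mean = statistics.fmean(training_values)
        self.std = statistics.pstdev(training_values)
        self.z_threshold = z_threshold

    def score(self, x):
        # Distance from the training mean, in standard deviations.
        return abs(x - self.mean) / self.std

    def is_shifted(self, x):
        return self.score(x) > self.z_threshold


def sample_for_retraining(watchdog, observations, k, seed=0):
    """Draw k observations, favoring those that drifted furthest."""
    weights = [watchdog.score(x) for x in observations]
    rng = random.Random(seed)
    return rng.choices(observations, weights=weights, k=k)


# Usage: item weights seen in training versus new observations.
watchdog = ShiftWatchdog([10.0, 11.0, 9.0, 10.5, 9.5])
new_items = [10.2, 25.0, 9.9, 30.0]
flagged = [x for x in new_items if watchdog.is_shifted(x)]
retraining_batch = sample_for_retraining(watchdog, new_items, k=3)
```

A production system would monitor high-dimensional inputs such as images rather than a single scalar, but the structure is the same: summarize the training distribution, score new inputs against it, and bias the retraining set toward what has changed.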

Philipp Michel: This is one of the reasons why we want to train the robot in different places: we learn early on what real-world data the robot may encounter, which in turn forces us to develop solutions that can cope with new data.

Sidd Srinivasa: This is indeed a good idea. One of the advantages of having multiple robots is the system's ability to recognize changed content, retrain, and then share this knowledge with other robots.

Think of a story about a sorting robot. In one corner of the world, a robot encounters a new packaging type. At first it struggles, because it has never seen anything like this before and cannot recognize it. Then a solution emerges: the robot can transmit the new packaging type to all the other robots in the world. That way, when this packaging type appears elsewhere, the other robots will know what to do with it. It is equivalent to having a "backup": when new data appears at one site, the other sites learn about it, because the system is able to retrain itself and share the information.

Philipp Michel: Our robots do something similar. If they encounter new obstacles that they haven't encountered before, we try to adjust the model to recognize and deal with these obstacles, and then deploy the new model to all robots.
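The pattern both speakers describe (one robot meets something new, the model is updated, and every robot receives the update) can be sketched as a versioned registry. Everything here is hypothetical: `FleetRegistry`, `Robot`, and the idea of modeling "retraining" as adding a label to a set are simplifications for illustration, not a description of either team's system.

```python
class FleetRegistry:
    """Hypothetical central store of what the fleet currently knows."""

    def __init__(self):
        self.version = 0
        self.known_types = set()

    def publish(self, new_type):
        # Stand-in for retraining: record the new type and bump the version.
        if new_type not in self.known_types:
            self.known_types.add(new_type)
            self.version += 1
        return self.version


class Robot:
    def __init__(self, name, registry):
        self.name = name
        self.registry = registry
        self.version = registry.version
        self.known_types = set(registry.known_types)

    def sync(self):
        # Pull the latest shared knowledge if this robot is out of date.
        if self.version < self.registry.version:
            self.known_types = set(self.registry.known_types)
            self.version = self.registry.version

    def handle(self, package_type):
        self.sync()
        if package_type in self.known_types:
            return "handled"
        # Unknown type: report it so the whole fleet can learn from it.
        self.registry.publish(package_type)
        self.sync()
        return "reported"


# Usage: robot A meets a new packaging type; robot B benefits later.
registry = FleetRegistry()
robot_a = Robot("A", registry)
robot_b = Robot("B", registry)
first = robot_a.handle("padded mailer")
second = robot_b.handle("padded mailer")
```

In this sketch the first encounter is reported and the second is handled, which mirrors the "backup" effect described in the story above.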

One of the things that keeps me up at night is the idea that our robots will encounter new objects on the sidewalk that they won't encounter again for the next three years, such as gargoyles that people put on their lawns for Halloween, or an umbrella placed on a picnic table so that it no longer looks like a "picnic table." In this case, all machine learning algorithms fail to recognize that this is a picnic table.

So part of our research is about how to balance generic objects, which do not need fine-grained classification, against specific categories that must be recognized. If this is an open manhole cover, the robot must be good at identifying it, otherwise it will fall in. But if it's just a random box, we probably don't need to know what kind of box it is, only that it is an object we want to go around.

Sidd Srinivasa: Another challenge is that when you change your model, there may be unintended consequences. The changed model may not affect the robot's perception, but it may change the way the robot "brakes", causing the ball bearings to wear out after two months. In end-to-end systems, a lot of interesting future research is about "understanding the impact of changes in parts of the system on the performance of the entire system."

Philipp Michel: We spent a lot of time thinking about whether we should separate the different parts of the robot stack. Integrating them can bring many benefits, but it also has limits. One extreme is end-to-end learning from camera input to motor torque, which is very challenging in any real-world robotics application. At the other extreme is the traditional robotics stack, which is neatly divided into parts such as localization, perception, planning, and control.

We also spent a lot of time thinking about how the stack should evolve over time, and what performance improvements come from bringing these pieces closer together. At the same time, we want a system that remains as interpretable as possible. We try to integrate learned components across the entire stack as much as we can while preserving interpretability and our safety features.

Sidd Srinivasa: This is a great point, and I completely agree with Philipp. Having one model to rule them all may not be the right approach. But often we end up building machine learning models that share a backbone with multiple application-specific heads. What an object is, and what it means to segment an object, may be similar across picking, stacking, and packing, but each task requires a specialized head riding on the shared backbone.

Philipp Michel: Some of the factors we consider are battery, range, temperature, space, and computing constraints. So we need to be efficient with our models, optimize them, and take advantage of the shared backbone as much as possible, with different heads for different tasks, as Sidd mentioned.
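The shared-backbone design can be sketched in a few lines. This is a toy illustration under stated assumptions: `SharedBackboneModel`, `toy_backbone`, the hand-written features, and the thresholds in the heads are all invented here, whereas a real system would use a learned neural network backbone with learned heads.

```python
class SharedBackboneModel:
    """Hypothetical model: one shared feature extractor, one head per task."""

    def __init__(self, backbone, heads):
        self.backbone = backbone
        self.heads = heads
        self.backbone_calls = 0

    def predict(self, observation):
        # The expensive shared computation runs once per observation...
        features = self.backbone(observation)
        self.backbone_calls += 1
        # ...and every lightweight task head reuses the same features.
        return {task: head(features) for task, head in self.heads.items()}


def toy_backbone(observation):
    # Stand-in for a learned feature extractor: (width, height, weight_kg).
    width, height, weight_kg = observation
    return {"area": width * height, "weight": weight_kg}


heads = {
    "pick": lambda f: f["weight"] < 5.0,    # light enough to pick up?
    "stack": lambda f: f["area"] >= 100.0,  # large enough base to stack on?
}

model = SharedBackboneModel(toy_backbone, heads)
result = model.predict((10.0, 12.0, 2.0))
```

Because the backbone runs once regardless of how many heads consume its output, adding a task costs only one extra head, which is the efficiency argument made above for robots with tight battery and compute budgets.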

Caption: Amazon Scout is an autonomous delivery robot that can move at walking speed on public sidewalks and is currently undergoing field testing in four states in the United States.

Q: When I asked about the commonalities between your projects, one thing that came to mind is that your robots all work in the same environment as humans. Why does this complicate the issue?

Sidd Srinivasa: Robots are moving closer to human life, and we must respect all the complex interactions that occur in the human world. In addition to walking, driving, and performing tasks, there are also complex social interactions. What's important for a robot is, first, to be aware of these interactions and, second, to take part in them.

It's really hard. When you're driving, it is sometimes difficult to tell what other people are thinking and to decide how to act based on that. Just reasoning about the problem is hard, and closing the loop is even harder.

If a robot is playing chess or another game against a human, it's much easier to predict what the opponent is going to do, because the rules are already well laid out. If you assume your opponents are optimal, you will do well even if they are suboptimal. This is guaranteed in some two-player games.
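The guarantee mentioned here is the minimax property of two-player zero-sum games: a player who plans against an optimal opponent secures at least the minimax value, whatever the opponent actually does. The sketch below illustrates it on a tiny hand-made game tree; the tree, its payoffs, and the function names are assumptions chosen for illustration.

```python
def minimax(node, maximizing):
    """Value of a game tree whose leaves are payoffs to the maximizer."""
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(node):
    # Index of the maximizer's move that is best against an optimal opponent.
    return max(range(len(node)), key=lambda i: minimax(node[i], False))


# We move first (choose a sub-list); the opponent then chooses a leaf.
tree = [[3, 5], [2, 9]]
guaranteed = minimax(tree, True)  # value against an optimal opponent
move = best_move(tree)            # the move that secures that value
possible_payoffs = tree[move]     # outcomes for every possible opponent reply
```

Here the guaranteed value is 3 and the chosen move is the first branch; if the opponent replies suboptimally, the payoff rises to 5, so a mistake by the opponent can only help. Cooperative, non-zero-sum settings offer no such guarantee.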

But real life is not like this. When we play cooperative games, where the aim is a win-win outcome, we find that it is actually difficult to predict accurately what collaborators will do during the game, even when they have good intentions.

Philipp Michel: And behavior in the human world varies greatly. Some pets completely ignore the robot, and some walk toward it. The same goes for pedestrians: some pay no attention to the robot, and others walk right up to it. Children, in particular, are extremely curious and highly interactive. We need to be able to handle all of these situations safely, and this variability is exciting.

Statement
This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.