It took Google two years to build 23 robots using reinforcement learning to help sort garbage

Reinforcement learning (RL) lets robots learn complex behaviors through trial-and-error interaction and improve steadily over time. Previous work at Google has explored how RL can enable robots to master complex skills such as grasping, multi-task learning, and even playing table tennis. Although robotic RL has made great progress, RL-enabled robots are still rarely seen in everyday environments. The real world is complex, diverse, and constantly changing, which poses huge challenges for robotic systems. Yet RL should be an excellent tool for exactly these challenges: by practicing, improving, and learning on the job, robots should be able to adapt to an ever-changing world.

In the Google paper "Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators," researchers explore this problem through a recent large-scale experiment: over two years, they deployed a fleet of 23 RL-enabled robots to sort waste and recycling in Google office buildings. The robotic system combines scalable deep RL from real-world data with bootstrapping from simulation training and auxiliary object-perception inputs, improving generalization while retaining the advantages of end-to-end training. The approach was validated with 4,800 evaluation trials.

Paper address: https://rl-at-scale.github.io/assets/rl_at_scale.pdf

Problem Setting

If people do not sort their waste properly, batches of recyclables can become contaminated and compost can end up in landfill. In Google's experiment, robots roamed office buildings looking for garbage stations (sets of recycling, compost, and other waste bins). The robot's task is to approach each garbage station and sort the waste, moving items between bins so that all recyclable items (cans, bottles) end up in the recycling bin, all compostable items (cardboard containers, paper cups) in the compost bin, and everything else in the other bin.
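The sorting goal described above can be written down as a small rule table. The sketch below is illustrative only: the `TARGET_BIN` mapping and the `target_bin` and `misplaced_items` helpers are hypothetical names, not taken from the paper, and the real system learns this behavior end to end from camera images rather than from a hand-written table.

```python
# Hypothetical rule table for the sorting goal. Item names come from the
# examples in the text; bin names are illustrative.
TARGET_BIN = {
    "can": "recycling",
    "bottle": "recycling",
    "cardboard container": "compost",
    "paper cup": "compost",
}


def target_bin(item: str) -> str:
    """Return the bin an item belongs in; anything unlisted goes to 'other'."""
    return TARGET_BIN.get(item, "other")


def misplaced_items(station: dict) -> list:
    """List (item, current_bin, correct_bin) for every item in the wrong bin."""
    moves = []
    for bin_name, items in station.items():
        for item in items:
            correct = target_bin(item)
            if correct != bin_name:
                moves.append((item, bin_name, correct))
    return moves


# A made-up garbage station with two sorting mistakes.
station = {
    "recycling": ["can", "paper cup"],
    "compost": ["bottle"],
    "other": ["chip bag"],
}
print(misplaced_items(station))
```

Each returned tuple corresponds to one pick-and-place move the robot would have to perform for the station to count as correctly sorted.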

This task is not as easy as it seems. Just picking up the wide variety of items that people throw away is already a huge challenge. The robot must also identify the appropriate bin for each object and sort the objects as quickly and efficiently as possible. In real office buildings, robots encounter a wide variety of unique situations.

Learning from Different Experiences

Continuous learning on the job helps, but before reaching that point the robot needs to be bootstrapped with a basic set of skills. To this end, Google uses four sources of experience: (1) simple hand-designed policies, which have a low success rate but provide initial experience; (2) a simulation training framework that uses simulation-to-real transfer to provide preliminary garbage sorting strategies; (3) "robot classrooms", where robots practice continuously on representative garbage stations; and (4) the real deployment environment, where robots practice in office buildings with real garbage.

Schematic diagram of reinforcement learning in this large-scale application. The policy is bootstrapped with script-generated data (top left). A simulation-to-real model is then trained, generating additional data in simulation (top right). During each deployment cycle, data collected in “robot classrooms” is added (bottom right), along with data collected from deployment in office buildings (bottom left).
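One way to picture how the four sources of experience feed a single learner is a mixed training batch. The following is a minimal sketch under stated assumptions: the `sample_mixed_batch` function, the episode IDs, and the mixing weights are all hypothetical, and the article does not say how the actual system balances its data sources.

```python
import random


def sample_mixed_batch(buffers, weights, batch_size, rng):
    """Draw a batch of (source, episode) pairs from several experience sources."""
    sources = list(buffers)
    # Choose a source for each batch slot in proportion to its assumed weight.
    picks = rng.choices(
        sources, weights=[weights[s] for s in sources], k=batch_size
    )
    # Then draw one stored episode uniformly from the chosen source.
    return [(source, rng.choice(buffers[source])) for source in picks]


# Placeholder episode IDs standing in for stored robot experience.
buffers = {
    "scripted": ["s0", "s1"],
    "simulation": ["sim0", "sim1", "sim2"],
    "classroom": ["c0", "c1", "c2", "c3"],
    "deployment": ["d0", "d1"],
}
# Hypothetical mixing weights, chosen only for illustration.
weights = {"scripted": 0.1, "simulation": 0.2, "classroom": 0.4, "deployment": 0.3}

batch = sample_mixed_batch(buffers, weights, batch_size=8, rng=random.Random(0))
print(batch)
```

Sampling by source first keeps a small but valuable source, such as real deployment data, from being drowned out by a much larger one.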

The reinforcement learning framework used here is based on QT-Opt, which was previously used to learn grasping of varied objects in laboratory settings, along with a range of other skills. Training starts in simulation, bootstrapped by a simple scripted policy; reinforcement learning is then applied, and RetinaGAN, a CycleGAN-based transfer method, makes the simulated images look more realistic.
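QT-Opt is a Q-learning method for continuous actions: instead of a separate actor network, it picks actions by approximately maximizing the learned Q-function with the cross-entropy method (CEM). The sketch below shows that action-selection step only, on a toy stand-in Q-function. The `cem_argmax` and `toy_q` names, the `[-1, 1]` action range, and all hyperparameter values are assumptions made for illustration, not the paper's settings.

```python
import random
import statistics


def cem_argmax(q_fn, state, action_dim, rng, iterations=3, samples=64, elites=6):
    """Approximate argmax over actions of q_fn(state, action) using CEM."""
    mean = [0.0] * action_dim
    std = [1.0] * action_dim
    best = []
    for _ in range(iterations):
        # Sample candidate actions from the current Gaussian, clipped to [-1, 1].
        candidates = [
            [max(-1.0, min(1.0, rng.gauss(m, s))) for m, s in zip(mean, std)]
            for _ in range(samples)
        ]
        # Keep the highest-scoring candidates and refit the Gaussian to them.
        candidates.sort(key=lambda a: q_fn(state, a), reverse=True)
        best = candidates[:elites]
        mean = [statistics.fmean([a[i] for a in best]) for i in range(action_dim)]
        std = [statistics.pstdev([a[i] for a in best]) + 1e-3 for i in range(action_dim)]
    # Return the best-scoring action from the final round.
    return best[0]


# Toy stand-in for a learned Q-function: highest at a fixed target action.
TARGET = [0.5, -0.2]


def toy_q(state, action):
    return -sum((a - t) ** 2 for a, t in zip(action, TARGET))


action = cem_argmax(toy_q, state=None, action_dim=2, rng=random.Random(0))
print(action)
```

In a real system `q_fn` would be a neural network that scores a camera image together with a candidate arm motion; here it is a simple quadratic so the example runs on its own.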

This is where the “robot classrooms” come in. While actual office buildings provide the most realistic experience, data collection throughput there is limited: some days there is a lot of trash to sort, other days not so much. The robots accumulate most of their experience in “robot classrooms”, where 20 robots practice the garbage sorting task.

While these robots train in “robot classrooms”, other robots learn simultaneously at 30 garbage stations in 3 office buildings.

Classification Performance

In the end, the researchers collected 540,000 trials in “robot classrooms” and 325,000 trials in the actual deployment environment. As data accumulated, the performance of the entire system improved. The researchers evaluated the final system in “robot classrooms” to allow for controlled comparisons, setting up scenarios based on what the robots would see in actual deployments. The final system achieved an average accuracy of about 84%, with performance improving steadily as data was added. For the real world, the researchers logged statistics from actual deployments in 2021 and 2022 and found that the system could reduce contamination in the bins by 40 to 50 percent by weight. The paper provides deeper insights into the design of the technology, ablation studies of various design decisions, and more detailed statistics from the experiments.
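The figures above are simple to reproduce as arithmetic. In the sketch below, the trial counts come from the text, while the `contamination_reduction` helper and the example bin weights are hypothetical and are not measurements from the paper.

```python
def contamination_reduction(before_kg: float, after_kg: float) -> float:
    """Fractional reduction in contaminant weight between two measurements."""
    if before_kg <= 0:
        raise ValueError("before_kg must be positive")
    return (before_kg - after_kg) / before_kg


# Trial counts reported in the text.
classroom_trials = 540_000
deployment_trials = 325_000
total_trials = classroom_trials + deployment_trials
print(total_trials)

# Hypothetical weights: 2.0 kg of contaminants before sorting, 1.0 kg after.
print(f"{contamination_reduction(2.0, 1.0):.0%}")
```

Measuring by weight rather than by item count means that a single heavy misplaced object affects the figure more than several light ones.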

Conclusion and Future Work

The experimental results show that an RL-based system can enable robots to handle practical tasks in real office environments. The combination of offline and online data lets robots adapt to the widely varying situations of the real world. At the same time, learning in more controlled "classroom" environments, both in simulation and in the real world, provides a powerful bootstrapping mechanism that gets the RL "flywheel" turning and so enables this adaptation.

Although important results have been achieved, much work remains: the final RL policies do not always succeed, more powerful models are needed to improve their performance, and the approach still has to be extended to a wider range of tasks. In addition, other sources of experience, including other tasks, other robots, and even Internet videos, may further supplement the bootstrapping experience gained from simulation and the "classrooms". These are issues to be addressed in the future.
