How to solve the long tail problem in autonomous driving scenarios?

In an interview yesterday I was asked whether I had worked on any long-tail problems, so I want to summarize the topic briefly.

The long-tail problem in autonomous driving refers to edge cases: scenarios that are possible but occur with low probability. The long-tail problem in perception is one of the main factors currently limiting the operational design domain (ODD) of single-vehicle intelligent autonomous vehicles. The underlying architecture and most technical issues of autonomous driving have been solved, and the remaining 5% of long-tail problems have gradually become the key constraint on the development of autonomous driving. These problems include a wide variety of fragmented scenarios, extreme situations, and unpredictable human behavior.

Edge Scenarios in Autonomous Driving

The "long tail" refers to edge cases for autonomous vehicles (AVs), that is, scenarios that are possible but have a low probability of occurring. These rare events are often missing from datasets because they occur infrequently and are highly varied. While humans are naturally good at handling edge cases, the same cannot be said for AI. Factors that can create edge cases include: trucks or irregularly shaped vehicles with protrusions, vehicles making sharp turns, driving through dense crowds, jaywalking pedestrians, extreme weather or poor lighting, people holding umbrellas, people moving boxes near cars, and trees that have fallen into the road.

Examples:

  1. Place a transparent film in front of the car: will the transparent object be recognized, and will the vehicle slow down?
  2. Lidar company AEye ran a challenge: how should an autonomous vehicle deal with a balloon floating in the middle of the road? L4 driverless cars tend to prioritize collision avoidance, so in this case they take evasive action or apply the brakes to avoid an accident. A balloon, however, is a soft object, and the vehicle could simply drive through it.

Methods to solve the long tail problem

Synthetic data is a broad concept, and perception data (NeRF, camera/sensor simulation) is just one of its more prominent branches. In industry, synthetic data has long been the standard answer for long-tail behavior simulation. Synthetic data, or sparse-signal upsampling, is one of the first-line solutions to the long-tail problem. Long-tail capability is the product of the model's generalization capability and the amount of information contained in the data.

Tesla solution:

Use synthetic data to generate edge scenarios and expand the dataset.

Data engine principle: first, inaccuracies in the existing model are detected, and those cases are then added to its unit tests. The engine also collects more data on similar cases to retrain the model. This iterative approach lets it capture as many edge cases as possible. There are two main challenges in building edge-case data: collecting and labeling edge cases is relatively expensive, and the collection itself may be very dangerous or even impossible.
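The data engine loop described above can be sketched in a few lines. This is only a minimal illustration of the idea (detect failures, pin them as unit tests, collect similar cases for retraining), not Tesla's actual implementation. The `DataEngine` class, the toy `model`, and the `find_similar` helper are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DataEngine:
    """Toy data engine: mine failures, pin them as tests, collect look-alikes."""
    regression_tests: list = field(default_factory=list)
    training_set: list = field(default_factory=list)

    def mine_failures(self, model, labeled_samples):
        # A failure is any sample where the prediction disagrees with the label.
        return [(x, y) for x, y in labeled_samples if model(x) != y]

    def step(self, model, labeled_samples, find_similar):
        failures = self.mine_failures(model, labeled_samples)
        for case in failures:
            # Pin each newly found failure as a permanent unit test.
            if case not in self.regression_tests:
                self.regression_tests.append(case)
            # Collect more data on similar cases for the next retraining run.
            self.training_set.extend(find_similar(case))
        return failures


# Hypothetical toy model: reports an obstacle only for object types it knows.
KNOWN_OBSTACLES = {"car", "pedestrian"}


def model(scene):
    return scene in KNOWN_OBSTACLES


def find_similar(case, n=3):
    # Stand-in for a fleet query or a simulator; it just makes labeled variants.
    scene, label = case
    return [(f"{scene}_variant_{i}", label) for i in range(n)]


samples = [("car", True), ("fallen_tree", True), ("balloon", False)]
engine = DataEngine()
print(engine.step(model, samples, find_similar))  # [('fallen_tree', True)]
```

In a real pipeline the retraining step would follow each call to `step`, and the pinned `regression_tests` would be rerun against every new model version.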

NVIDIA Solution:

NVIDIA recently proposed a strategy called "imitation training" (see the figure below). In this approach, real-world system failure cases are recreated in a simulated environment and then used as training data for autonomous vehicles. This cycle is repeated until the model's performance converges. The goal is to improve the robustness of the autonomous driving system by continuously simulating failure scenarios. Training in simulation lets developers better understand and resolve the different failure scenarios seen in the real world, and it can quickly generate large amounts of training data to improve model performance.

[Figure: NVIDIA's imitation training loop]

In the real-world scenes below, the model loses a vehicle's bounding box because the truck is too tall (top) or because a protruding part of one vehicle occludes the vehicle behind it (bottom). These become edge cases, and NVIDIA's improved model can generate the correct bounding boxes in them.

[Figure: real-world edge scenes with a very tall truck (top) and a vehicle occluded by a protruding part of another vehicle (bottom)]
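The imitation training cycle described above (recreate real failures in simulation, retrain, repeat until performance converges) can be illustrated with a deliberately tiny example. This is a sketch under strong assumptions, not NVIDIA's method: the "model" is a single learned threshold (the tallest truck it has seen in training), and `recreate_in_sim`, `failure_rate`, and `imitation_training` are hypothetical names.

```python
import random


def failure_rate(max_height, real_scenes):
    # The toy detector misses any truck taller than what it saw in training.
    misses = [h for h in real_scenes if h > max_height]
    return len(misses) / len(real_scenes), misses


def recreate_in_sim(height, rng, n_variants=5):
    # Hypothetical simulator stub: the recreated scene plus jittered variants.
    return [height] + [height + rng.uniform(-0.3, 0.3) for _ in range(n_variants)]


def imitation_training(train_heights, real_scenes, max_rounds=10, tol=1e-9, seed=0):
    rng = random.Random(seed)
    train = list(train_heights)
    history, prev_rate = [], None
    for _ in range(max_rounds):
        max_height = max(train)  # "training" the toy model
        rate, misses = failure_rate(max_height, real_scenes)
        history.append(rate)
        # Stop when nothing fails or the failure rate has stopped changing.
        if not misses or (prev_rate is not None and abs(prev_rate - rate) <= tol):
            break
        for h in misses:
            train.extend(recreate_in_sim(h, rng))  # failures become training data
        prev_rate = rate
    return max(train), history


# Truck heights in metres; values are made up for illustration.
_, history = imitation_training([2.0, 3.0, 3.5], [2.5, 3.0, 3.8, 4.2, 4.5])
print(history)  # [0.6, 0.0]
```

With a real perception model, each round would involve rendering the recreated scenes, retraining the network, and re-evaluating on held-out real data. The stopping rule is the part that carries over.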

Some thoughts:

Q: Is synthetic data valuable?

A: There are two kinds of value here. The first is test effectiveness: whether the generated scenarios can expose deficiencies in the detection algorithm. The second is training effectiveness: whether the generated scenarios actually improve performance when used to train the algorithm.

Q: How can virtual data be used to improve performance? Is it really necessary to add virtual data to the training set? Could adding it cause a performance regression?

A: These questions are difficult to answer, so many different approaches to improving training accuracy have emerged:

  • Hybrid training: add virtual data to the real data in different proportions to improve performance.
  • Transfer learning: pre-train the model on real data, freeze certain layers, and then train on the mixed data.
  • Imitation learning: a natural approach is to design scenarios around the model's errors and generate data from them, gradually improving the model's performance. In real-world data collection and model training, supplementary data is likewise collected in a targeted way to improve performance.
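As a concrete illustration of hybrid training, the sketch below builds a training set in which a chosen fraction of samples is virtual. It is a minimal example under stated assumptions: `mix_datasets` and its parameters are hypothetical names, all real samples are kept, and the virtual samples are drawn at random without replacement.

```python
import random


def mix_datasets(real, synthetic, synthetic_fraction, seed=0):
    """Return a shuffled list where about `synthetic_fraction` of items are synthetic.

    All real samples are kept. Synthetic samples are drawn without replacement
    and capped at the number available.
    """
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_syn / (n_real + n_syn) = f for n_syn.
    n_syn = round(synthetic_fraction * len(real) / (1.0 - synthetic_fraction))
    n_syn = min(n_syn, len(synthetic))
    mixed = list(real) + rng.sample(list(synthetic), n_syn)
    rng.shuffle(mixed)
    return mixed


real = [("real", i) for i in range(80)]
synthetic = [("syn", i) for i in range(100)]
mixed = mix_datasets(real, synthetic, synthetic_fraction=0.2)
print(len(mixed), sum(1 for tag, _ in mixed if tag == "syn"))  # 100 20
```

To check whether virtual data helps or causes a regression, one would sweep `synthetic_fraction` over several values, train on each mix, and compare the results on a validation set made of real data only.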

Some extensions:

To thoroughly evaluate the robustness of an AI system, its unit tests must cover both general cases and edge cases. However, some edge cases may not be available in existing real-world datasets. In that situation, AI practitioners can use synthetic data for testing.

One example is ParallelEye-CS, a synthetic dataset used to test the visual intelligence of self-driving cars. The benefit of creating synthetic data compared to using real-world data is the multi-dimensional control over the scene for each image.
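The "multi-dimensional control over the scene" can be made concrete with a small sketch that enumerates every combination of a few scene parameters and reports where a detector disagrees with the expected behavior. This does not reflect how ParallelEye-CS itself is built. The axes, the `expected_brake` rule, and the `toy_detector` are all assumptions made up for illustration.

```python
from itertools import product

# Hypothetical scene axes; a real simulator would expose many more.
SCENE_AXES = {
    "weather": ["clear", "rain", "fog"],
    "lighting": ["day", "dusk", "night"],
    "obstacle": ["none", "balloon", "fallen_tree", "jaywalking_pedestrian"],
}


def enumerate_scenes(axes):
    # Yield one dict per combination of axis values (the full grid).
    names = list(axes)
    for values in product(*(axes[name] for name in names)):
        yield dict(zip(names, values))


def expected_brake(scene):
    # Assumed ground truth: brake for hard obstacles, not for a balloon.
    return scene["obstacle"] in {"fallen_tree", "jaywalking_pedestrian"}


def toy_detector(scene):
    # Toy system under test: it goes blind in fog at night.
    if scene["weather"] == "fog" and scene["lighting"] == "night":
        return False
    return expected_brake(scene)


def run_edge_case_suite(detector, axes, expected):
    # Return every scene where the detector's decision differs from the expected one.
    return [s for s in enumerate_scenes(axes) if detector(s) != expected(s)]


failures = run_edge_case_suite(toy_detector, SCENE_AXES, expected_brake)
print(len(list(enumerate_scenes(SCENE_AXES))), len(failures))  # 36 2
```

Because each axis is controlled independently, a failure can be traced to a specific combination of conditions, which is hard to do with real-world footage.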

Synthetic data is a viable way to handle edge cases in production AV models. It supplements real-world datasets with edge cases, helping AVs remain robust even in unusual events. It can also be more scalable, less error-prone, and cheaper than collecting real-world data.
