Data closed-loop research: The development of autonomous driving shifts from technology-driven to data-driven

Zuosi Auto R&D has released the "2022 China Autonomous Driving Data Closed-Loop Research Report".

1. The development of autonomous driving gradually shifts from technology-driven to data-driven

Autonomous driving sensor solutions and computing platforms have become increasingly homogeneous, and the technological gap between suppliers is narrowing. Over the past two years, autonomous driving technology has iterated rapidly and mass production has accelerated. According to Zuosi Data Center, the cumulative number of domestic passenger vehicles equipped with L2 assisted driving reached 4.79 million in 2021, a year-on-year increase of 58.0%. From January to June 2022, the penetration rate of L2 assisted driving in China's new passenger car market climbed to 32.4%.

For autonomous driving, data runs through the entire life cycle of research and development, testing, mass production, and operation and maintenance. As the number of sensors on smart connected cars grows rapidly, the amount of data generated by ADAS and autonomous vehicles is also increasing exponentially, from GB to TB, PB, and EB, and even ZB in the future. Only car companies that use data to drive vehicle evolution and meet users' personalized needs will stay competitive in the long run.

According to the "Safety Guidelines for Automobile Collection Data Processing", automobile collection data refers to the data collected by automobile sensing equipment and control units, as well as the data generated by processing it. It can be subdivided into off-vehicle data, cockpit data, operating data, location trajectory data, and so on.


The "Several Regulations on Automobile Data Security Management (Trial)", promulgated by the Cyberspace Administration of China in August 2021, specify in detail the entire process of collecting, analyzing, storing, transmitting, querying, applying, and deleting automobile data. Automobile data processing should adhere to the principles of "in-car processing", "no collection by default", "applicable accuracy range", and "desensitization processing" in order to reduce the disorderly collection and illegal abuse of automobile data. In developing autonomous driving technology, data collection and processing must first of all be legal and compliant.

Data collection/cleaning

The large amounts of unstructured data (images, video, speech) collected by car cameras, millimeter-wave radar, lidar, and ultrasonic radar can be raw and chaotic. To make the data meaningful, it needs to be cleaned, structured, and organized. Data from multiple sources is first imported into the appropriate repository, its format is standardized, and it is aggregated according to relevant rules. The data is then checked for corrupted, duplicate, or missing data points, and unnecessary data that may affect the overall quality of the data set is discarded. Finally, labels are used to classify videos captured under different conditions, such as day, night, sunny, and rainy. This step provides the clean, structured data that will be used for training and validation.
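The cleaning steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not any vendor's pipeline: the `Clip` record, its field names, and the day/night and weather rules are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clip:
    """One recorded sensor clip; all field names are illustrative."""
    clip_id: str
    checksum: str                # content hash, used to spot duplicates
    duration_s: Optional[float]  # None models a missing value
    hour: int                    # local hour of capture, 0-23
    raining: bool


def clean(clips):
    """Drop clips with a missing or invalid duration, then drop duplicates."""
    seen = set()
    kept = []
    for clip in clips:
        if clip.duration_s is None or clip.duration_s <= 0:
            continue  # corrupted or missing data point
        if clip.checksum in seen:
            continue  # duplicate content
        seen.add(clip.checksum)
        kept.append(clip)
    return kept


def condition_labels(clip):
    """Tag a clip with coarse capture conditions (assumed, simplified rules)."""
    time_of_day = "day" if 6 <= clip.hour < 18 else "night"
    weather = "rainy" if clip.raining else "sunny"
    return [time_of_day, weather]


raw = [
    Clip("a", "h1", 12.0, 9, False),
    Clip("b", "h1", 12.0, 9, False),   # duplicate of "a"
    Clip("c", "h2", None, 22, True),   # missing duration
    Clip("d", "h3", 30.5, 23, True),
]
cleaned = clean(raw)
labels = {c.clip_id: condition_labels(c) for c in cleaned}
print(labels)
```

Only clips "a" and "d" survive cleaning here, and each carries a time-of-day and a weather label that later stages could use to balance the training set.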

Data annotation

The structured data produced by cleaning then needs to be labeled. Annotation is the process of assigning coded values to raw data. Coded values include, but are not limited to, class labels, bounding boxes, and object boundaries. High-quality annotations are needed to teach supervised learning models what objects are and to measure the performance of trained models.
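As an illustration of how a bounding-box annotation can also be used to measure a trained model, the sketch below compares a labeled box with a predicted box using intersection over union (IoU), a common overlap metric. The `BoxAnnotation` type and the sample coordinates are assumptions for the example and do not come from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxAnnotation:
    """Axis-aligned 2D bounding box with a class label (pixel coordinates)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self):
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def iou(a, b):
    """Intersection over union of two boxes, a value in [0, 1]."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    return inter / (a.area() + b.area() - inter)


truth = BoxAnnotation("pedestrian", 10, 10, 50, 90)  # human annotation
pred = BoxAnnotation("pedestrian", 20, 10, 60, 90)   # model output
score = iou(truth, pred)
print(round(score, 3))
```

A prediction is typically counted as correct when its label matches and its IoU with the annotation exceeds a chosen threshold, which is why annotation quality directly bounds how well a model can be evaluated.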


In the field of autonomous driving, the scenarios handled in data annotation usually include lane changing and overtaking, passing through intersections, and unprotected left and right turns without traffic light control, as well as complex long-tail scenes such as vehicles running red lights, pedestrians crossing the road, and vehicles parked illegally on the roadside.

Commonly used annotation tools include general image drawing, lane line annotation, driver face annotation, 3D point cloud annotation, 2D/3D fusion annotation, panoramic semantic segmentation, etc. Due to the development of big data and the increase in the number of large data sets, the use of data annotation tools continues to expand rapidly.

Data transmission

Data collection frequency has now reached the millisecond level, and what is required is high-precision data across thousands of signal dimensions (such as bus signals, sensor internal states, software instrumentation points, user behavior, and environment perception data). Data loss, disorder, jumps, and delays must be avoided, and transmission and storage costs must be greatly reduced while maintaining high accuracy and high quality. The uplink and downlink paths for Internet of Vehicles data are relatively long (from the vehicle MCU, DCU, and gateway through 4G/5G to the cloud), so data transmission quality must be ensured at every node along the link.
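One simple way to detect the loss, disorder, and delay problems mentioned above is to attach a sequence number and a timestamp to every frame and check them on arrival. The following is a minimal sketch under that assumption; the function name, the tuple layout, and the 20 ms gap limit are hypothetical.

```python
def check_stream(frames, max_gap_ms=20):
    """Scan (sequence_number, timestamp_ms) pairs and report anomalies.

    Returns a list of (kind, sequence_number) tuples, where kind is one of
    "lost", "out_of_order", or "delayed".
    """
    issues = []
    prev_seq, prev_ts = None, None
    for seq, ts in frames:
        if prev_seq is not None:
            if seq <= prev_seq:
                issues.append(("out_of_order", seq))
                continue  # keep the last in-order frame as the reference
            for missing in range(prev_seq + 1, seq):
                issues.append(("lost", missing))
            # Allow max_gap_ms per expected frame between the two arrivals.
            if ts - prev_ts > max_gap_ms * (seq - prev_seq):
                issues.append(("delayed", seq))
        prev_seq, prev_ts = seq, ts
    return issues


frames = [(1, 0), (2, 10), (4, 30), (3, 20), (5, 40), (6, 95)]
issues = check_stream(frames)
print(issues)
```

In this sample, frame 3 is first reported as lost and then as out of order when it arrives late; a fuller implementation would reconcile the two reports, and the same check would run at each node on the path to the cloud.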

In response to these changes in data transmission, some companies now provide efficient data collection and vehicle-cloud integrated transmission solutions. One example is Zhixiehuitong's EXCEEDDATA flexible data acquisition platform, which runs real-time computation at the 10-millisecond level in the vehicle's edge computing environment to trigger flexible data collection and upload. Because the uploaded data has already been computed and filtered on the vehicle, the upload volume is significantly reduced. In addition, the original vehicle signals are stored with lossless compression at ratios of 100 to 300 times, and the cloud management platform saves these high-quality signals losslessly at a high compression ratio. The platform supports pushing data acquisition algorithms to vehicles, triggering multiple acquisition modes, uploading collected data in real time, downloading it to the business desktop with one click, and filtering flexibly by vehicle, event, time period, and so on. Storage and computation are separated, which closes the loop of vehicle-cloud isomorphic data collection, computation, upload, and processing. In 2021, the first domestic mass-produced model equipped with the EXCEEDDATA solution, the HiPhi X, was launched.


Source: Zhixiehuitong
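The idea of edge-side triggering described above, where the vehicle computes on the live signal and uploads only what a rule selects, can be sketched as follows. This is a generic illustration and not the EXCEEDDATA implementation: the `TriggeredRecorder` class, the `hard_braking` rule, and its threshold are all assumptions.

```python
from collections import deque


class TriggeredRecorder:
    """Keep a short rolling window of samples and emit it when a rule fires."""

    def __init__(self, trigger, pre_samples=3):
        self.trigger = trigger                   # callable: sample -> bool
        self.window = deque(maxlen=pre_samples)  # context leading up to the trigger
        self.uploads = []                        # stands in for the cloud uplink

    def push(self, sample):
        self.window.append(sample)
        if self.trigger(sample):
            # Upload only the context window, not the full stream.
            self.uploads.append(list(self.window))
            self.window.clear()


def hard_braking(sample):
    """Hypothetical rule: deceleration stronger than 4 m/s^2."""
    return sample["accel_mps2"] < -4.0


recorder = TriggeredRecorder(hard_braking, pre_samples=3)
stream = [0.1, -0.3, -0.5, -1.0, -5.2, -0.2, 0.0]
for i, accel in enumerate(stream):
    recorder.push({"t_ms": i * 10, "accel_mps2": accel})

uploaded = sum(len(batch) for batch in recorder.uploads)
print(uploaded, "of", len(stream), "samples uploaded")
```

Because the rule is just a callable, new acquisition rules could be pushed from the cloud without changing the recorder itself, which is the property the paragraph above refers to as issuing data acquisition algorithms to the vehicle.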

Data storage

To perceive the surrounding environment more clearly, self-driving cars are being equipped with more sensors, which generate large amounts of data. Some high-level autonomous driving systems carry more than 40 sensors of various kinds to accurately perceive the 360° environment around the vehicle. Developing an autonomous driving system involves multiple steps, including data collection, data aggregation, cleaning and labeling, model training, simulation, and big data analysis. This process involves aggregating and storing massive amounts of data, moving data between different systems at different stages, and reading and writing massive amounts of data during model training. Storage bottlenecks therefore pose new challenges.

For this reason, the technologies and capabilities of cloud service providers in this area have become key to helping car companies succeed. For example, Amazon Web Services (AWS) centers its approach on an autonomous driving data lake to help car companies build an end-to-end autonomous driving data closed loop. The data lake is built on Amazon Simple Storage Service (Amazon S3, a cloud object storage service) and supports data collection, data management and analysis, data annotation, model and algorithm development, simulation verification, map development, and DevOps and MLOps, so that car companies can more easily carry out the development, testing, and application of autonomous driving across the entire process.


Source: AWS

Among domestic technology giants, take Baidu's data closed-loop solution as an example. Its data storage layer provides a data retrieval service for multi-source roadside and vehicle data, used for searching massive data on the business platform. Its advantages include multi-dimensional retrieval (by vehicle information, mileage, autonomous driving duration, and so on), management of the entire data life cycle from production to destruction, a panoramic data view, data traceability, and open data sharing.

Baidu autonomous driving data closed-loop solution architecture


Source: Baidu
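Multi-dimensional retrieval of the kind described above amounts to filtering stored drive metadata on several optional criteria at once. The sketch below shows the idea with an in-memory list; it is not Baidu's service, and the `DriveRecord` fields and `search` parameters are hypothetical names chosen to mirror the dimensions listed in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriveRecord:
    """Metadata for one stored drive; field names are illustrative."""
    record_id: str
    vehicle_id: str
    mileage_km: float
    autonomous_minutes: float


def search(records, vehicle_id=None, min_mileage_km=None, min_autonomous_minutes=None):
    """Return the records that match every filter that is not None."""
    results = []
    for r in records:
        if vehicle_id is not None and r.vehicle_id != vehicle_id:
            continue
        if min_mileage_km is not None and r.mileage_km < min_mileage_km:
            continue
        if min_autonomous_minutes is not None and r.autonomous_minutes < min_autonomous_minutes:
            continue
        results.append(r)
    return results


catalog = [
    DriveRecord("r1", "car-01", 12.5, 8.0),
    DriveRecord("r2", "car-01", 80.0, 55.0),
    DriveRecord("r3", "car-02", 95.0, 20.0),
]
hits = search(catalog, vehicle_id="car-01", min_mileage_km=50)
print([r.record_id for r in hits])
```

At production scale the same query shape would be served by an indexed metadata store rather than a linear scan, but the separation between small searchable metadata and large raw sensor files stays the same.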

2. Efficient development of autonomous driving requires the construction of a data closed-loop system

The development of autonomous driving has shifted from technology-driven to data-driven, but data-driven business models face many difficulties.

Massive data is difficult to process: High-level autonomous driving test vehicles collect data at the TB level every day, and the development team needs PB-level storage space, yet the valuable data that can be used for training accounts for less than 5% of the total. In addition, strict safety compliance requirements apply to data collected by sensors such as vehicle cameras, lidar, and high-precision positioning, which poses great challenges for the access, storage, desensitization, and processing of massive data.

The cost of data annotation is high: Data annotation consumes a great deal of manpower and time. As high-level autonomous driving capabilities develop, scenarios keep growing more complex and more difficult scenarios will appear. Improving the accuracy of the vehicle perception model places higher requirements on the scale and quality of the training data set. Traditional manual annotation can no longer meet the demand of model training for massive data sets in terms of efficiency and cost.

Simulation test efficiency is low: Virtual simulation is an effective means of accelerating the training of autonomous driving algorithms, but simulation scenarios are difficult to construct and reproduce reality with low fidelity, especially complex and dangerous scenarios. In addition, parallel simulation capability is insufficient, so simulation testing is inefficient and the algorithm iteration cycle is too long.

High-precision map coverage is low: High-precision maps currently rely mainly on self-collected data and self-made maps, and cover only the designated roads used in the experimental stage. As commercialization expands to urban streets in major cities across the country, maps will face prominent challenges in coverage, dynamic updates, cost, and efficiency.
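The storage figures in the first difficulty above can be made concrete with a rough calculation. The fleet size, per-vehicle daily volume, and test duration below are assumed values chosen only for illustration; the 5% share is used as the upper bound stated in the text.

```python
def storage_estimate(vehicles, tb_per_vehicle_day, days, valuable_share):
    """Return (raw_pb, valuable_pb) using decimal units (1 PB = 1000 TB)."""
    raw_tb = vehicles * tb_per_vehicle_day * days
    raw_pb = raw_tb / 1000
    return raw_pb, raw_pb * valuable_share


# Assumed inputs: 20 test vehicles, 2 TB per vehicle per day, 100 days,
# and 5% as the upper bound on the share that is useful for training.
raw_pb, valuable_pb = storage_estimate(20, 2, 100, 0.05)
print(raw_pb, valuable_pb)
```

Under these assumptions the fleet produces about 4 PB of raw data, of which at most about 0.2 PB is valuable for training, which is why filtering data before it is uploaded and stored matters so much for cost.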

In order to solve various difficulties and problems, efficient development of autonomous driving requires the construction of an efficient data closed-loop system.



Source: Freetech

As far as the autonomous driving data closed loop is concerned, corner cases need to be solved continually as autonomous driving is deployed. This requires sufficient data samples and a convenient vehicle-side verification method. Shadow mode is one of the best solutions for corner cases.

Shadow mode was proposed by Tesla in April 2019. It runs on the vehicle to compare the relevant decisions and trigger data upload. The self-driving software on vehicles already sold continuously records the data detected by the sensors and selectively transmits it back at appropriate times, where it is used for machine learning and for improving the original self-driving algorithm.
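The comparison at the heart of shadow mode can be sketched as follows: the model runs in the background, its output is compared with what the human driver actually did, and only the moments of disagreement are selected for upload. This is a sketch of the general idea and not Tesla's implementation; the `Frame` fields and the 5-degree threshold are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One time step; steering angles in degrees (illustrative fields)."""
    t_ms: int
    driver_steer_deg: float   # what the human actually did
    shadow_steer_deg: float   # what the background model would have done


def disagreement_frames(frames, threshold_deg=5.0):
    """Select frames where model and driver differ by more than the threshold."""
    return [f for f in frames
            if abs(f.driver_steer_deg - f.shadow_steer_deg) > threshold_deg]


log = [
    Frame(0, 0.0, 0.5),
    Frame(100, 2.0, 1.0),
    Frame(200, 15.0, 3.0),   # the driver swerves, the model would not have
    Frame(300, 4.0, 4.5),
]
to_upload = disagreement_frames(log)
print([f.t_ms for f in to_upload])
```

Frames where the model and the driver agree carry little new information, so uploading only the disagreements concentrates bandwidth and labeling effort on likely corner cases.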


Dojo supercomputer can use massive video data for unsupervised annotation and training.

In 2021, Tesla delivered 936,200 vehicles globally, of which 484,100 were delivered by its Chinese factory. In the first half of 2022, it delivered 560,000 vehicles. Tesla takes advantage of its mass production scale and continuously optimizes its algorithms through shadow mode. With shadow mode, millions of sold vehicles serve as test vehicles that capture their surroundings and special road conditions, continuously strengthening the system's ability to predict, avoid, and learn from uncertain events. With millions of sold vehicles behind it, coverage of corner cases and extreme working conditions is more comprehensive. The high-quality data collected through flexible triggering can yield better algorithms, and the quality of algorithm iteration determines the value of the software. In software upgrade subscription services, the power of the data closed loop has only begun to show.

3. Data closed loop becomes the core of iterative upgrade of autonomous driving

The premise of continuous iteration of an autonomous driving system is continuous optimization of its algorithms, and the quality of the algorithms depends on the efficiency of the data closed-loop system. The efficient flow of data through each stage of autonomous driving development is crucial. Data intelligence will become the key to accelerating the mass production of autonomous driving.

In December 2021, Haomo Zhixing officially released MANA Xuehu, China's first autonomous driving data intelligence system, which accelerates the evolution of autonomous driving technology through five major capabilities: perception, cognition, annotation, simulation, and calculation. Over the next three years, its assisted driving system is expected to be installed on more than 1 million passenger cars. Relying on its fully self-developed autonomous driving system, Haomo Zhixing has built significant advantages in accumulating, processing, and applying data. Massive data brings an advantage in technological iteration, and the gains in cost reduction and efficiency are clear.

As another example, Momenta has built leading full-process data-driven technical capabilities: its algorithm modules, including perception, fusion, prediction, and control, can all be iterated and updated efficiently in a data-driven manner. Its Closed Loop Automation (CLA) is a tool chain that lets data flow drive the automatic iteration of data-driven algorithms. CLA can automatically filter golden data out of massive data sets, drive the automatic iteration of algorithms, and make the self-driving flywheel spin faster and faster.

Source: Momenta

In the context of software-defined cars, data, algorithms, and computing power are the troika of autonomous driving development. Car companies' research and development cycles are shortening and function iteration is accelerating. In the future, the key to the sustainable development of autonomous driving companies will be the ability to collect data continuously at low cost and high efficiency, and to iterate algorithms on real data so as to ultimately form both a data closed loop and a business closed loop.


Statement
This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it deleted.
