Better than all methods! HIMap: End-to-end vectorized HD map construction

王林

Mar 19, 2024 pm 03:00 PM

Vectorized high-definition (HD) map construction requires predicting the category and point coordinates of map elements such as road boundaries, lane dividers, and pedestrian crossings. State-of-the-art methods rely mainly on point-level representation learning to regress precise point coordinates. However, this pipeline is limited in capturing element-level information and in handling element-level failures, such as wrong element shapes or entanglement between elements. To address these problems, the paper proposes a simple yet effective HybrId framework, named HIMap, that learns both point-level and element-level information and lets the two interact.

Specifically, a hybrid representation called HIQuery is introduced to represent all map elements, and a point-element interactor is proposed to interactively extract the hybrid information of elements, such as point position and element shape, and encode it into HIQuery. In addition, a point-element consistency constraint is proposed to enhance the consistency between point-level and element-level information. Finally, the point-element integrated HIQuery output by the model can be directly converted into the class, point coordinates, and mask of each map element. Extensive experiments on the nuScenes and Argoverse2 datasets show consistently superior results over previous methods. Notably, the method achieves 77.8 mAP on the nuScenes dataset, surpassing the previous SOTA by at least 8.3 mAP.

Paper name: HIMap: HybrId Representation Learning for End-to-end Vectorized HD Map Construction

Paper link: https://arxiv.org/pdf/2403.08639.pdf

HIMap first introduces a hybrid representation called HIQuery to represent all map elements. HIQuery is a set of learnable parameters that can be iteratively updated and refined by interacting with BEV features. A multi-layer hybrid decoder is then designed to encode the hybrid information of map elements (such as point position and element shape) into HIQuery and to perform point-element interaction (see Figure 2). Each layer of the hybrid decoder consists of a point-element interactor, self-attention, and an FFN. Inside the point-element interactor, a mutual interaction mechanism exchanges point-level and element-level information and avoids the learning bias of single-level information. Finally, the point-element integrated HIQuery output by the decoder can be directly converted into each element's point coordinates, class, and mask. A point-element consistency constraint is also proposed to enhance the consistency between point-level and element-level information.

HIMap Framework Overview

The overall pipeline of HIMap is shown in Figure 3(a). HIMap is compatible with a variety of onboard sensor data, such as RGB images from multi-view cameras, point clouds from LiDAR, or multi-modal data. Multi-view RGB images are used here as an example to explain how HIMap works.

BEV feature extractor: The BEV feature extractor extracts BEV features from multi-view RGB images. It consists of three parts: a backbone that extracts multi-scale 2D features from each view, an FPN that fuses and refines the multi-scale features into single-scale features, and a 2D-to-BEV feature transformation module that maps the 2D features into BEV features.

HIQuery: To fully learn both the point-level and element-level information of map elements, HIQuery is introduced to represent all elements in the map.
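As a rough illustration of the data layout, the sketch below assumes that each map element carries a fixed number of point-level queries plus one element-level query, each a feature vector of the same length. The names `HybridQuery` and `init_hiquery`, the sizes, and the Gaussian initialization are illustrative assumptions, not taken from the paper's code.

```python
import random
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass
class HybridQuery:
    """One map element: several point-level queries plus one element-level query."""
    point_queries: List[Vector]  # num_points vectors, each of length `channels`
    element_query: Vector        # a single vector of length `channels`


def init_hiquery(num_elements: int, num_points: int, channels: int,
                 seed: int = 0) -> List[HybridQuery]:
    """Create randomly initialized queries (stand-in for learnable parameters)."""
    rng = random.Random(seed)

    def rand_vec() -> Vector:
        return [rng.gauss(0.0, 0.02) for _ in range(channels)]

    return [
        HybridQuery([rand_vec() for _ in range(num_points)], rand_vec())
        for _ in range(num_elements)
    ]


# Hypothetical sizes: 50 elements, 20 points per element, 8 channels.
queries = init_hiquery(num_elements=50, num_points=20, channels=8)
print(len(queries), len(queries[0].point_queries), len(queries[0].element_query))
```

Keeping the two levels side by side in one container is what allows a later stage to update them jointly rather than learning each level in isolation.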

Hybrid decoder: The hybrid decoder produces the point-element integrated HIQuery by iteratively interacting the HIQuery Qh with the BEV features X.
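The iterative refinement can be sketched as a loop over layers, each applying a point-element interactor, self-attention, and an FFN in turn. This is a structural sketch only: the toy stages below are placeholders chosen so the example runs with plain Python lists, and the assumption that only the interactor reads the BEV features belongs to this sketch, not to the paper.

```python
from typing import Callable, List, Tuple

Features = List[List[float]]  # rows of feature vectors (toy stand-in for tensors)

# One decoder layer = (point-element interactor, self-attention, FFN).
Layer = Tuple[
    Callable[[Features, Features], Features],  # interactor(query, bev)
    Callable[[Features], Features],            # self_attention(query)
    Callable[[Features], Features],            # ffn(query)
]


def hybrid_decoder(query: Features, bev: Features, layers: List[Layer]) -> Features:
    """Refine the query layer by layer; in this sketch only the interactor reads `bev`."""
    for interactor, self_attention, ffn in layers:
        query = interactor(query, bev)
        query = self_attention(query)
        query = ffn(query)
    return query


# Toy stages, purely illustrative (not the paper's operators).
def toy_interactor(query: Features, bev: Features) -> Features:
    # Add the channel-wise mean of the BEV features to every query row.
    mean = [sum(col) / len(bev) for col in zip(*bev)]
    return [[q + m for q, m in zip(row, mean)] for row in query]


def toy_self_attention(query: Features) -> Features:
    # Uniform attention: mix every row with the average of all rows.
    avg = [sum(col) / len(query) for col in zip(*query)]
    return [[0.5 * q + 0.5 * a for q, a in zip(row, avg)] for row in query]


def toy_ffn(query: Features) -> Features:
    # ReLU as a minimal per-row nonlinearity.
    return [[max(0.0, q) for q in row] for row in query]


layers = [(toy_interactor, toy_self_attention, toy_ffn)] * 2
refined = hybrid_decoder([[0.0, 1.0], [2.0, -1.0]], [[1.0, 1.0], [3.0, 1.0]], layers)
print(refined)
```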

The goal of the point-element interactor is to interactively extract the point-level and element-level information of map elements and encode it into HIQuery. The two levels of information interact because they are complementary: point-level information carries local position knowledge, while element-level information provides global shape and semantic knowledge. The interaction therefore lets the local and global information of map elements refine each other.
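To make the idea of mutual refinement concrete, the toy function below lets the element feature absorb a summary of its points (local to global) while each point absorbs the element feature (global to local). The averaging scheme, the name `mutual_interaction`, and the mixing weight `alpha` are assumptions made for illustration; the paper's interactor is a learned module and is not reproduced here.

```python
from typing import List, Tuple

Vector = List[float]


def mutual_interaction(points: List[Vector], element: Vector,
                       alpha: float = 0.5) -> Tuple[List[Vector], Vector]:
    """Exchange information between point features and one element feature.

    Both updates read the *old* values, so the exchange is symmetric.
    """
    channels = len(element)
    # Local -> global: summarize all points of the element by their mean.
    mean_point = [sum(p[i] for p in points) / len(points) for i in range(channels)]
    new_element = [(1 - alpha) * element[i] + alpha * mean_point[i]
                   for i in range(channels)]
    # Global -> local: blend the element feature into every point.
    new_points = [[(1 - alpha) * p[i] + alpha * element[i] for i in range(channels)]
                  for p in points]
    return new_points, new_element


pts, elem = mutual_interaction([[0.0, 0.0], [2.0, 0.0]], [1.0, 1.0])
print(pts)   # [[0.5, 0.5], [1.5, 0.5]]
print(elem)  # [1.0, 0.5]
```

After one exchange, the points have moved toward the element feature and the element feature has moved toward the mean of its points, which is the complementary refinement described above in miniature.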

Point-level and element-level representations differ by nature: the former focuses on local information, the latter on overall information. Learning the two representations may therefore interfere with each other, which makes information interaction harder and less effective. A point-element consistency constraint is therefore introduced to enhance the consistency between the point-level and element-level information of each element; it also improves the discriminability of elements.
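One simple way to express such a constraint is a loss that is small when an element's pooled point features agree in direction with its element feature. The sketch below uses mean pooling and cosine distance; both choices, along with the function names, are assumptions for illustration and may differ from the formulation in the paper.

```python
import math
from typing import List

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def consistency_loss(point_feats: List[Vector], element_feat: Vector) -> float:
    """Cosine distance between mean-pooled point features and the element feature."""
    pooled = [sum(col) / len(point_feats) for col in zip(*point_feats)]
    return 1.0 - cosine_similarity(pooled, element_feat)


# Aligned features give a loss near 0; orthogonal features give a loss near 1.
print(consistency_loss([[1.0, 0.0], [1.0, 0.0]], [2.0, 0.0]))
print(consistency_loss([[1.0, 0.0], [1.0, 0.0]], [0.0, 1.0]))
```

Minimizing a term of this kind alongside the main training losses would push the two levels of each element to agree, which is the stated purpose of the constraint.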

Comparison of experimental results

The paper reports experiments on the nuScenes and Argoverse2 datasets.

[Image: comparison with SOTA models on the nuScenes val set]

[Image: comparison with SOTA models on the Argoverse2 val set]

[Image: comparison with SOTA models on the nuScenes val set with multi-modal data]

[Image: additional ablation experiments]

Statement

This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to request removal.
