


Huawei HiSilicon Canada Research Institute and the University of Alberta have jointly introduced a neural network performance prediction framework based on pre-training and knowledge injection.
Evaluating the performance of neural networks (precision, recall, PSNR, etc.) requires substantial resources and time, and is the main bottleneck of neural architecture search (NAS). Early NAS methods required extensive resources to train each newly searched architecture from scratch. In recent years, network performance predictors have attracted increasing attention as an efficient performance evaluation method.
However, current predictors are limited in scope: they can only model network structures from a specific search space, and can only predict the performance of new structures on a specific task. For example, if the training samples contain only classification networks and their accuracies, the resulting predictor can only be used to evaluate new network structures on image classification.
To break this limitation and enable a predictor to estimate the performance of a given network structure on multiple tasks, with cross-task and cross-data generalization, Huawei HiSilicon Canada Research Institute and the University of Alberta jointly introduced a neural network performance prediction framework based on pre-training and knowledge injection. The framework can quickly evaluate networks of different structures and types on many kinds of CV tasks, such as classification, detection, and segmentation, for use in neural architecture search. The paper has been accepted at AAAI 2023.
- Paper link: https://arxiv.org/abs/2211.17228
- Code link: https://github.com/Ascend-Research/AIO-P
The AIO-P (All-in-One Predictors) approach aims to extend the scope of neural predictors to computer vision tasks beyond classification. AIO-P uses K-Adapter technology to inject task-related knowledge into the predictor model, and designs a label scaling mechanism based on FLOPs (floating point operations) to adapt to different performance metrics and distributions. AIO-P uses a unique pseudo-labeling scheme to train the K-Adapters, generating new training samples in just minutes. Experimental results show that AIO-P has strong performance prediction capabilities, achieving excellent MAE and SRCC results on several computer vision tasks. In addition, AIO-P can directly transfer to and predict the performance of never-before-seen network structures, and can work with NAS to reduce the FLOPs of existing networks without degrading performance.
Method Introduction
AIO-P is a general network performance predictor that generalizes across multiple tasks. It achieves cross-task, cross-search-space performance prediction through predictor pre-training and domain-specific knowledge injection. AIO-P uses K-Adapter technology to inject task-related knowledge into the predictor, and relies on a common computational graph (CG) format to represent network structures, ultimately enabling it to support networks from different search spaces and tasks, as shown in Figure 1 below.
Figure 1. How AIO-P represents the network structure used for different tasks
In addition, AIO-P's pseudo-labeling mechanism can quickly generate new training samples for the K-Adapters. To bridge the gap between performance measurement ranges on different tasks, AIO-P proposes a FLOPs-based label scaling method to achieve cross-task performance modeling. Extensive experiments show that AIO-P makes accurate performance predictions on a variety of CV tasks, such as pose estimation and segmentation, without task-specific training samples or with only a small amount of fine-tuning. AIO-P can also correctly rank the performance of never-before-seen network structures and, combined with a search algorithm, was used to optimize Huawei's face recognition network, keeping its performance unchanged while reducing FLOPs by more than 13.5%.
Computer vision networks usually consist of a "backbone" that performs feature extraction and a "head" that uses the extracted features to make predictions. The backbone is usually designed based on a known network family (ResNet, Inception, MobileNet, ViT, UNet), while the head is designed for a given task, such as classification, pose estimation, or segmentation. Traditional NAS solutions manually customize the search space around the backbone's structure. For example, if the backbone is MobileNetV3, the search space may include the number of MBConv blocks, the parameters of each MBConv (kernel size, expansion ratio), the number of channels, and so on. However, such a customized search space is not universal: a backbone designed around ResNet cannot be optimized within that NAS framework, and the search space must be redesigned.
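As a rough illustration of such a backbone-specific search space, the sketch below specifies a MobileNetV3-style space and samples one candidate from it. The field names and value ranges are hypothetical, not the actual spaces used with AIO-P:

```python
import random

# A minimal, illustrative MobileNetV3-style search space. Field names and
# value ranges are hypothetical, not the actual spaces used with AIO-P.
MBV3_SPACE = {
    "kernel_sizes": [3, 5, 7],               # depthwise kernel size per block
    "expansion_ratios": [3, 4, 6],           # MBConv expansion factor
    "blocks_per_stage": [2, 3, 4],           # number of MBConv blocks per stage
    "width_multipliers": [0.75, 1.0, 1.25],  # channel scaling
}

def sample_architecture(space=MBV3_SPACE, num_stages=5):
    """Randomly sample one candidate backbone configuration."""
    return [
        {
            "blocks": random.choice(space["blocks_per_stage"]),
            "kernel": random.choice(space["kernel_sizes"]),
            "expansion": random.choice(space["expansion_ratios"]),
            "width_mult": random.choice(space["width_multipliers"]),
        }
        for _ in range(num_stages)
    ]
```

A space like this is tied to MBConv-style backbones, which is exactly the rigidity the computational-graph representation below is meant to remove.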
To solve this problem, AIO-P represents different network structures at the computational graph level, achieving a unified representation of any network structure. As shown in Figure 2, the computational graph format allows AIO-P to encode the head and backbone together to represent the entire network. This also allows AIO-P to predict the performance of networks from different search spaces (such as MobileNets and ResNets) on various tasks.
Figure 2. Representation of the Squeeze-and-Excite module in MobileNetV3 at the computational graph level
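A minimal sketch of what a computational-graph encoding of this kind might look like as GNN input, using the Squeeze-and-Excite module of Figure 2 as the example. The op vocabulary and one-hot node features are illustrative assumptions, not the paper's exact format:

```python
import torch

# Illustrative op vocabulary; AIO-P's actual node feature scheme may differ.
OP_VOCAB = {"input": 0, "conv": 1, "depthwise_conv": 2, "relu": 3,
            "sigmoid": 4, "global_pool": 5, "multiply": 6, "output": 7}

def encode_computational_graph(nodes, edges):
    """Turn a list of op names and (src, dst) edges into GNN-ready tensors.

    nodes: list[str], one op name per node in topological order.
    edges: list[tuple[int, int]], directed edges of the graph.
    """
    x = torch.nn.functional.one_hot(
        torch.tensor([OP_VOCAB[n] for n in nodes]), num_classes=len(OP_VOCAB)
    ).float()                             # node features: one-hot op type
    edge_index = torch.tensor(edges).t()  # shape [2, num_edges]
    return x, edge_index

# Squeeze-and-Excite as a tiny computational graph (cf. Figure 2).
nodes = ["input", "global_pool", "conv", "relu",
         "conv", "sigmoid", "multiply", "output"]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 6), (5, 6), (6, 7)]
x, edge_index = encode_computational_graph(nodes, edges)
```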
The predictor structure proposed in AIO-P starts from a single GNN regression model (Figure 3, green block) that predicts the performance of image classification networks. To add knowledge of other CV tasks, such as detection or segmentation, the study attaches a K-Adapter (Figure 3, orange block) to the original regression model. Each K-Adapter is trained on samples from the new task while the original model weights stay frozen, so the study trains multiple K-Adapters separately (Figure 4) to incorporate knowledge from multiple tasks.
Figure 3. AIO-P predictor with a K-Adapter
Figure 4. AIO-P predictor with multiple K-Adapters
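The freeze-and-adapt pattern can be sketched in PyTorch as follows. The layer sizes, the concatenation-based fusion, and the module names are assumptions for illustration; the paper and repository define the actual architecture:

```python
import torch
import torch.nn as nn

class AIOPPredictorSketch(nn.Module):
    """Frozen base regressor plus a trainable task-specific K-Adapter."""

    def __init__(self, base_encoder: nn.Module, hidden_dim: int = 128):
        super().__init__()
        self.base_encoder = base_encoder           # pretrained on classification
        for p in self.base_encoder.parameters():
            p.requires_grad = False                # base weights stay frozen
        # Task-specific adapter; only these weights train on the new task.
        self.adapter = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        self.head = nn.Linear(2 * hidden_dim, 1)   # fuse base + adapter features

    def forward(self, graph_embedding: torch.Tensor) -> torch.Tensor:
        base_feat = self.base_encoder(graph_embedding)
        task_feat = self.adapter(base_feat)
        return self.head(torch.cat([base_feat, task_feat], dim=-1)).squeeze(-1)

# Usage: only the adapter and head receive gradients.
model = AIOPPredictorSketch(base_encoder=nn.Linear(64, 128))
optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3)
```

Because the base stays frozen, each new task adds only a small number of trainable parameters on top of the shared pretrained model.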
To further reduce the cost of training each K-Adapter, the study proposes a clever pseudo-labeling technique. It uses a latent sampling scheme to train a "head" model that can be shared across tasks. The shared head can then be paired with any backbone in the search space and fine-tuned to generate a pseudo-label in 10-15 minutes (Figure 5).
Figure 5. Training a "head" model that can be shared between different tasks
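The pairing-and-fine-tuning step might look roughly like the following. The frozen backbone, the generic regression loss, `metric_fn`, and the small step budget are all simplifying assumptions for illustration, not the paper's exact procedure:

```python
import torch
import torch.nn as nn

def pseudo_label(backbone: nn.Module, shared_head: nn.Module,
                 loader, metric_fn, steps: int = 200) -> float:
    """Briefly fine-tune a task-shared head on top of a backbone, then
    report the resulting task metric as a pseudo-label for that backbone."""
    for p in backbone.parameters():
        p.requires_grad = False          # assumption: reuse backbone features as-is
    optimizer = torch.optim.Adam(shared_head.parameters(), lr=1e-3)
    data_iter = iter(loader)
    for _ in range(steps):               # minutes of training, not days
        try:
            inputs, targets = next(data_iter)
        except StopIteration:
            data_iter = iter(loader)
            inputs, targets = next(data_iter)
        # Generic regression loss as a stand-in for the task-specific loss.
        loss = nn.functional.mse_loss(shared_head(backbone(inputs)), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return metric_fn(backbone, shared_head)  # the pseudo-label
```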
Experiments show that the pseudo-labels obtained with shared heads are positively correlated with the actual performance obtained by training a network from scratch for a day or more, with the rank correlation coefficient (Spearman) sometimes exceeding 0.5.
In addition, different tasks use different performance metrics, each with its own typical range. For example, a classification network with a given backbone may reach about 75% accuracy on ImageNet, while its mAP on MS-COCO object detection may be 30-35%. To account for these different ranges, the study proposes interpreting network performance through a normal distribution, based on standardization. In plain terms: if the transformed value is 0, the network's performance is average for the task; if it is above 0, the network is better than average; if it is below 0, it is worse.
Figure 6. How to normalize network performance
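Concretely, this amounts to a z-score transform over each task's label distribution. A minimal sketch is below; the sample values are made up for illustration:

```python
import numpy as np

def standardize_labels(labels: np.ndarray) -> np.ndarray:
    """Map raw task metrics (accuracy, mAP, ...) onto a common z-score scale:
    0 means average for the task, >0 above average, <0 below average."""
    return (labels - labels.mean()) / labels.std()

# Example: ImageNet accuracies near 75% and COCO mAPs near 0.33
# land on the same scale after standardization.
imagenet_acc = np.array([0.74, 0.75, 0.76, 0.77])
coco_map = np.array([0.30, 0.32, 0.33, 0.35])
z_cls = standardize_labels(imagenet_acc)
z_det = standardize_labels(coco_map)
```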
The FLOPs of a network depend on model size and input resolution, and are generally positively correlated with performance. The study uses FLOPs-based transformations on the labels that AIO-P learns from.
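One way such a FLOPs-based label transform could work is to rescale each metric by a function of the network's FLOPs before standardizing, so that labels reflect performance relative to compute. The specific rescaling below is an illustrative assumption, not the paper's formula:

```python
import numpy as np

def flops_transform(labels: np.ndarray, flops: np.ndarray) -> np.ndarray:
    """Illustrative FLOPs-based label transform: divide each metric by the
    square root of its network's FLOPs, then z-score across the sample set."""
    scaled = labels / np.sqrt(flops)
    return (scaled - scaled.mean()) / scaled.std()
```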
Experiments and Results
The study first trained AIO-P on human pose estimation and object detection, then used it to predict the performance of network structures on multiple tasks, including pose estimation (LSP and MPII), object detection (OD), instance segmentation (IS), semantic segmentation (SS), and panoptic segmentation (PS). Even under zero-shot direct transfer, AIO-P's predictions for networks from the Once-for-All (OFA) search spaces (ProxylessNAS, MobileNetV3, and ResNet-50) on these tasks achieved an MAE below 1.0% and a rank correlation above 0.5.
In addition, the study used AIO-P to predict the performance of networks from the TensorFlow-Slim open source model zoo (such as DeepLab semantic segmentation models, ResNets, Inception nets, MobileNets, and EfficientNets), network structures that may never have appeared in AIO-P's training samples.
AIO-P achieves nearly perfect SRCC on three DeepLab semantic segmentation model zoos, positive SRCC on all four classification model zoos, and SRCC = 1.0 on the EfficientNet models when using the FLOPs transformation.
Finally, the core motivation of AIO-P is to pair it with a search algorithm and use it to optimize arbitrary network structures, which may be standalone architectures that belong to no known search space or model zoo, and may even target a task the predictor was never trained on. The study uses AIO-P with a random mutation search algorithm to optimize the face recognition (FR) model used on Huawei phones. The results show that AIO-P can reduce the model's FLOPs by more than 13.5% while maintaining performance (precision (Pr) and recall (Rc)).
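Schematically, pairing the predictor with random mutation search might look like the sketch below, where `mutate`, `predict_performance` (the predictor's output), and `count_flops` are assumed interfaces rather than the study's actual implementation:

```python
def mutation_search(seed_arch, mutate, predict_performance, count_flops,
                    iterations: int = 1000):
    """Keep mutating the current best architecture; accept a mutant when the
    predictor scores it at least as high while it needs fewer FLOPs."""
    best = seed_arch
    best_score = predict_performance(best)
    best_flops = count_flops(best)
    for _ in range(iterations):
        candidate = mutate(best)          # random local edit of the architecture
        score = predict_performance(candidate)
        flops = count_flops(candidate)
        if score >= best_score and flops < best_flops:
            best, best_score, best_flops = candidate, score, flops
    return best
```

Because every candidate is scored by the predictor instead of being trained, the loop explores thousands of architectures at negligible cost.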
Interested readers can read the original text of the paper to learn more research details.