USB: The first semi-supervised classification learning benchmark that unifies visual, language and audio classification tasks


Apr 13, 2023, 02:46 PM


Currently, the development of semi-supervised learning is in full swing. However, existing semi-supervised learning benchmarks are mostly limited to computer vision classification tasks, excluding consistent and diverse evaluation of classification tasks such as natural language processing and audio processing. In addition, most semi-supervised papers are published by large institutions, and it is often difficult for academic laboratories to participate in advancing the field due to limitations in computing resources.

To this end, researchers from Microsoft Research Asia, Westlake University, Tokyo Institute of Technology, Carnegie Mellon University, the Max Planck Institute and other institutions proposed the Unified SSL Benchmark (USB): the first semi-supervised classification learning benchmark that unifies visual, language and audio classification tasks.

The benchmark not only covers more diverse application domains, but also uses pretrained vision models for the first time to greatly reduce the time needed to validate semi-supervised algorithms, making semi-supervised research more accessible to researchers, especially small research groups. The paper has been accepted at NeurIPS 2022, a top international academic conference in artificial intelligence.


Article link: https://arxiv.org/pdf/2208.07204.pdf

Code link: https://github.com/microsoft/Semi-supervised-learning

Supervised learning builds models that fit labeled data. Neural network models trained with supervised learning on large amounts of high-quality labeled data produce competitive results.

For example, according to statistics from the Papers with Code website, traditional supervised learning methods can achieve an accuracy of more than 88% on ImageNet, a data set with millions of images. However, obtaining large amounts of labeled data is often time-consuming and laborious.

To reduce the dependence on labeled data, semi-supervised learning (SSL) aims to exploit large amounts of unlabeled data to improve model generalization when only a small amount of labeled data is available. Semi-supervised learning is also an important topic in machine learning. Before deep learning, researchers in this field proposed classic algorithms such as semi-supervised support vector machines, entropy regularization, and co-training.

Deep semi-supervised learning

With the rise of deep learning, deep semi-supervised learning algorithms have also made great progress. At the same time, technology companies including Microsoft, Google, and Meta have also recognized the huge potential of semi-supervised learning in practical scenarios.

For example, Google uses noisy student training, a semi-supervised algorithm, to improve its search performance [1]. The most representative semi-supervised algorithms today train with a cross-entropy loss on labeled data and with consistency regularization on unlabeled data, which encourages predictions that are invariant to input perturbations.

For example, the FixMatch [2] algorithm proposed by Google at NeurIPS 2020 uses augmentation anchoring and fixed thresholding to improve the model's generalization to data augmented at different strengths and to reduce the impact of noisy pseudo-labels. During training, FixMatch filters out unlabeled samples whose prediction confidence falls below a user-provided, pre-defined threshold.
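The fixed-threshold filtering described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the default threshold of 0.95, and the use of raw logit lists are assumptions made for the example. The weakly augmented view supplies the pseudo-label, and the strongly augmented view is trained to match it only when the weak-view confidence reaches the threshold.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Masked cross-entropy over a batch of unlabeled samples.

    weak_logits / strong_logits: one logit list per sample, taken from the
    weakly and strongly augmented views of the same input (hypothetical
    inputs for this sketch). Returns (loss, number_of_samples_kept).
    """
    total = 0.0
    kept = 0
    for w, s in zip(weak_logits, strong_logits):
        probs_w = softmax(w)
        confidence = max(probs_w)
        if confidence < threshold:
            continue  # low-confidence pseudo-label: sample is filtered out
        pseudo_label = probs_w.index(confidence)
        probs_s = softmax(s)
        # Cross-entropy of the strong view against the hard pseudo-label.
        total += -math.log(probs_s[pseudo_label])
        kept += 1
    n = len(weak_logits)
    # Average over the whole batch, so filtered samples contribute zero.
    loss = total / n if n else 0.0
    return loss, kept


weak = [[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
strong = [[5.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
loss, kept = fixmatch_unlabeled_loss(weak, strong)
print(kept, loss)
```

In this toy batch only the first sample passes the threshold, so the second sample contributes nothing to the loss.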

FlexMatch [3], jointly proposed by Microsoft Research Asia and Tokyo Institute of Technology at NeurIPS 2021, takes into account that different categories differ in learning difficulty. It therefore proposes curriculum pseudo labeling, in which different thresholds are used for different categories.

Specifically, for easy-to-learn categories, the model should set a high threshold to reduce the impact of noisy pseudo-labels; for difficult-to-learn categories, the model should set a low threshold to encourage the model to fit that category. The learning difficulty of each class is estimated from the number of unlabeled samples that are predicted as that class with confidence above a fixed value.

At the same time, researchers from Microsoft Research Asia also collaborated to propose TorchSSL [4], a unified PyTorch-based code library for semi-supervised methods, which provides unified support for deep methods, common data sets and benchmark results in the field.
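As a rough sketch of the class-wise thresholds just described, the snippet below counts, per class, the unlabeled samples predicted as that class with confidence above a fixed value, normalizes the counts by the largest one, and scales the fixed threshold accordingly. The function name, the linear scaling, and the fallback when no sample is confident are assumptions made for illustration; the published method may differ in details such as the normalization and the mapping function.

```python
def flexmatch_thresholds(confidences, predictions, num_classes, base_threshold=0.95):
    """Per-class thresholds from the current learning status of each class.

    confidences: max predicted probability for each unlabeled sample.
    predictions: predicted class index for each unlabeled sample.
    (Both are hypothetical inputs for this sketch.)
    """
    # Count confident predictions per class: a proxy for how well it is learned.
    learned = [0] * num_classes
    for conf, pred in zip(confidences, predictions):
        if conf > base_threshold:
            learned[pred] += 1

    top = max(learned)
    if top == 0:
        # Assumption of this sketch: with no confident sample yet,
        # fall back to the fixed threshold for every class.
        return [base_threshold] * num_classes

    # Well-learned (easy) classes keep a high threshold;
    # poorly learned (hard) classes get a lower one.
    return [(count / top) * base_threshold for count in learned]


confidences = [0.99, 0.98, 0.97, 0.96, 0.99, 0.50]
predictions = [0, 0, 0, 0, 1, 2]
print(flexmatch_thresholds(confidences, predictions, num_classes=3))
```

Here class 0 has the most confident predictions and keeps the full threshold, while classes 1 and 2 receive lower thresholds so that more of their pseudo-labels pass the filter.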

Figure 1: FlexMatch algorithm process

Problems and challenges in the current semi-supervised learning code library

Although the development of semi-supervised learning is in full swing, researchers have noticed that most current semi-supervised papers focus only on computer vision (CV) classification tasks. For other fields, such as natural language processing (NLP) and audio processing, researchers cannot know whether the algorithms that are effective on CV tasks remain effective.

In addition, most semi-supervised papers are published by large institutions, and it is often difficult for academic laboratories to participate in advancing the field because of limited computing resources. In general, semi-supervised learning benchmarks currently have the following two problems:

(1) Insufficient diversity. Most existing semi-supervised learning benchmarks are limited to CV classification tasks (i.e., CIFAR-10/100, SVHN, STL-10 and ImageNet classification), excluding consistent and diverse evaluation of classification tasks in NLP, audio and other fields, even though the lack of sufficient labeled data is also a common problem in NLP and audio.

(2) Time-consuming and unfriendly to academia. Existing semi-supervised learning benchmarks such as TorchSSL are often time-consuming and environmentally unfriendly because they usually require training deep neural network models from scratch. Specifically, evaluating FixMatch [2] using TorchSSL requires approximately 300 GPU days. Such high training costs make SSL-related research unaffordable for many research laboratories (especially those in academia or small research groups), thus hindering the progress of SSL.

USB: A new benchmark library with diverse tasks and more friendly to researchers

To solve the above problems, researchers from Microsoft Research Asia teamed up with researchers from Westlake University, Tokyo Institute of Technology, Carnegie Mellon University, the Max Planck Institute and other institutions to propose the Unified SSL Benchmark (USB), the first semi-supervised classification learning benchmark to unify visual, language and audio classification tasks.

Compared with previous semi-supervised learning benchmarks (such as TorchSSL) that focused only on a small number of vision tasks, this benchmark not only introduces more diverse application domains, but also, for the first time, uses a pretrained vision model (a pretrained vision Transformer) to greatly reduce the validation time of semi-supervised algorithms (from 7000 GPU hours to 900 GPU hours, a nearly eightfold reduction), making semi-supervised research more friendly to researchers, especially small research groups.

The paper has been accepted at NeurIPS 2022, a top international academic conference in artificial intelligence.

Solution provided by USB

So how does USB solve the problems of current semi-supervised benchmarks in one go? The researchers mainly made the following improvements:

(1) To enhance task diversity, USB introduces 5 CV data sets, 5 NLP data sets and 5 audio data sets, and provides a diverse and challenging benchmark that enables consistent evaluation of multiple tasks from different domains. Table 1 provides a detailed comparison of tasks and training time between USB and TorchSSL.


Table 1: Task and training time comparison between USB and TorchSSL frameworks

(2) To improve training efficiency, the researchers introduced pretrained vision Transformers into SSL instead of training ResNets from scratch. Specifically, they found that using pretrained models can significantly reduce the number of training iterations without affecting performance (e.g., reducing the number of training iterations for a CV task from 1 million steps to 200,000 steps).

(3) To be more researcher-friendly, the researchers implemented 14 SSL algorithms and open-sourced a modular code library together with the related configuration files, so that the results in the USB report can be reproduced easily. To help users get started quickly, USB also provides detailed documentation and tutorials. In addition, USB provides a pip package so that users can call the SSL algorithms directly. The researchers promise to keep adding new algorithms (such as imbalanced semi-supervised algorithms) and more challenging data sets to USB in the future. Table 2 shows the algorithms and modules already supported in USB.


Table 2: Supported algorithms and modules in USB

By using large amounts of unlabeled data to train more accurate and robust models, semi-supervised learning will have important research and application value in the future. Researchers at Microsoft Research Asia hope that USB will help academia and industry make greater progress in the field of semi-supervised learning.


Statement

This article is reproduced from 51CTO.COM.
