


Self-supervised learning lets computers observe the world and understand it by learning the structure of images, speech, or text, and it has driven many of the recent major advances in artificial intelligence.
Despite the considerable effort researchers around the world have invested in this area, self-supervised learning algorithms still learn from images, speech, text, and other modalities in very different ways. Against this backdrop, the AI outlet Analytics India Magazine has compiled a list of the top ten self-supervised learning models of 2022 for its readers.
Data2vec
Paper link: https://arxiv.org/pdf/2202.03555.pdf
Open source code: https://t.co/3x8VCwGI2x
Meta AI released the data2vec algorithm in January as a single self-supervised method spanning speech, images, and text. According to the team, the model is also highly competitive on NLP tasks.
Data2vec uses neither contrastive learning nor reconstruction of the input examples. Instead, the Meta AI team states, it is trained to predict model representations from a partial view of the input data.
The team explained: "We first encode a masked version of the training sample with the student model. Then, with the same model, we encode the unmasked version of the input to build the training targets. This teacher model and the student model differ only in their parameters."
The model thus predicts the representation of the unmasked training sample from the masked one, which removes the dependence on modality-specific objectives in the learning task.
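This teacher-student setup can be sketched in plain numpy. The sketch below is only an illustration of the idea, not the paper's implementation: a single linear map stands in for the Transformer encoder, and the masking scheme and EMA rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 8                                    # feature dimension of the toy encoder
student_W = rng.normal(size=(D, D)) * 0.1
teacher_W = student_W.copy()             # teacher starts as a copy of the student

def encode(W, x):
    # A single linear map stands in for the real Transformer encoder.
    return x @ W

def ema_update(teacher_W, student_W, tau=0.999):
    # Teacher parameters track the student via an exponential moving average.
    return tau * teacher_W + (1 - tau) * student_W

x = rng.normal(size=(16, D))             # one sample: 16 "time steps"
mask = np.zeros(16, dtype=bool)
mask[rng.choice(16, size=8, replace=False)] = True   # positions hidden from the student

x_masked = x.copy()
x_masked[mask] = 0.0                     # crude masking of the student's input view

student_repr = encode(student_W, x_masked)  # student sees the partial view
teacher_repr = encode(teacher_W, x)         # teacher sees the full input

# Regression loss only at the masked positions: the student predicts the
# teacher's contextualized representation, not the raw input.
loss = np.mean((student_repr[mask] - teacher_repr[mask]) ** 2)
teacher_W = ema_update(teacher_W, student_W)
```

Because the target is a model representation rather than pixels, tokens, or audio samples, the same objective applies unchanged across modalities.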
ConvNeXt
Paper link: https://arxiv.org/pdf/2201.03545.pdf
Open source code: https://t.co/nWx2KFtl7X
ConvNeXt, billed as a ConvNet for the 2020s, is a model released by the Meta AI team in March. It is built entirely from standard ConvNet modules and is therefore accurate, simple in design, and scalable.
VICReg
Paper link: https://t.co/H7crDPHCHV
Open source code: https://t.co/oadSBT61P3
Variance-Invariance-Covariance Regularization (VICReg) combines a variance term with a decorrelation mechanism based on redundancy reduction and covariance regularization, preventing the collapse in which the encoder produces constant or uninformative vectors.
VICReg requires no techniques such as weight sharing between branches, batch normalization, feature normalization, output quantization, stop-gradients, or memory banks, yet it achieves results on several downstream tasks that are comparable to the state of the art. Experiments further show that the variance regularization term can stabilize the training of other methods and improve their performance.
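The three terms described above can be sketched in plain numpy. This is a toy version under stated assumptions: the 25/25/1 term weights match the paper's defaults, but the function name and shapes are illustrative.

```python
import numpy as np

def vicreg_loss(za, zb, sim_w=25.0, var_w=25.0, cov_w=1.0, eps=1e-4):
    """Toy VICReg objective for two batches of embeddings of shape (N, D)."""
    n, d = za.shape

    # Invariance: mean-squared distance between the two views' embeddings.
    sim = np.mean((za - zb) ** 2)

    # Variance: hinge keeping each dimension's std above 1, which prevents
    # collapse to a constant vector.
    def var_term(z):
        std = np.sqrt(z.var(axis=0) + eps)
        return np.mean(np.maximum(0.0, 1.0 - std))
    var = var_term(za) + var_term(zb)

    # Covariance: push off-diagonal covariances toward zero (decorrelation /
    # redundancy reduction between feature dimensions).
    def cov_term(z):
        zc = z - z.mean(axis=0)
        cov = (zc.T @ zc) / (n - 1)
        off_diag = cov - np.diag(np.diag(cov))
        return np.sum(off_diag ** 2) / d
    cov = cov_term(za) + cov_term(zb)

    return sim_w * sim + var_w * var + cov_w * cov

rng = np.random.default_rng(0)
za = rng.normal(size=(32, 16))
zb = za + 0.1 * rng.normal(size=(32, 16))   # a slightly perturbed second view
loss = vicreg_loss(za, zb)
```

Note that collapsed embeddings (all-zero vectors) are penalized purely by the variance hinge, which is why no stop-gradient or memory bank is needed.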
STEGO
Paper link: https://arxiv.org/abs/2203.08414
MIT's Computer Science and Artificial Intelligence Laboratory, in collaboration with Microsoft and Cornell University, developed the Self-supervised Transformer with Energy-based Graph Optimization (STEGO) to solve one of the most difficult tasks in computer vision: assigning a label to every pixel of an image without human supervision.
STEGO learns "semantic segmentation": simply put, assigning a label to each pixel in an image.
Semantic segmentation is an important skill for today's computer vision systems because real-world scenes can be cluttered with overlapping objects. To make matters harder, these objects do not always fit into tidy boxes. Algorithms tend to work better on discrete "things" like people and cars than on hard-to-delineate regions like vegetation, the sky, and mashed potatoes.
Take a scene of a dog playing in a park as an example. Previous systems might only be able to identify the dog, but by assigning a label to every pixel of the image, STEGO can decompose the scene into its main components: the dog, the sky, the grass, and the dog's owner.
Machines that can "see the world" are crucial to a variety of emerging technologies, such as self-driving cars and predictive models for medical diagnosis. Since STEGO can learn without labels, it can detect objects in different domains, even objects that humans do not yet fully understand.
CoBERT
Paper link: https://arxiv.org/pdf/2210.04062.pdf
For self-supervised speech representation learning, researchers from the Chinese University of Hong Kong (Shenzhen) proposed Code BERT (CoBERT). Unlike other self-distillation methods, their model predicts representations from a different modality: it converts speech into a sequence of discrete codes for representation learning.
The research team first used codes from a pretrained HuBERT model to train a code model in the discrete space. They then distilled the code model into a speech model, aiming to enable better learning across modalities.
CoBERT outperforms the best current algorithms on ASR tasks and brings significant improvements on the SUPERB speech translation (ST) task. The significant improvement on ST suggests that CoBERT's representations may carry more linguistic information than previous work.
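The step of turning continuous speech features into a sequence of discrete codes can be illustrated with a toy k-means quantizer. This is only a sketch of the quantization idea under simplifying assumptions; CoBERT itself derives its codes from pretrained HuBERT features, and the frame data, cluster count, and function name here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "speech features": 99 frames of 12-dim vectors drawn from 3 clusters,
# standing in for real acoustic features from a pretrained encoder.
centers = rng.normal(size=(3, 12)) * 3.0
frames = np.vstack([c + rng.normal(size=(33, 12)) for c in centers])

def kmeans_codes(x, k=3, iters=10):
    """Quantize each frame to the index of its nearest centroid."""
    cb = x[rng.choice(len(x), size=k, replace=False)]  # init codebook from data
    for _ in range(iters):
        # Assign every frame to the closest codebook entry.
        codes = np.argmin(((x[:, None] - cb[None]) ** 2).sum(-1), axis=1)
        # Move each centroid to the mean of its assigned frames.
        for j in range(k):
            if np.any(codes == j):
                cb[j] = x[codes == j].mean(axis=0)
    return codes, cb

codes, codebook = kmeans_codes(frames)
```

The resulting integer sequence `codes` plays the role of the discrete targets that the code model is trained on.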
ColloSSL
Paper link: https://arxiv.org/pdf/2202.00758.pdf
Researchers at Nokia Bell Labs, in collaboration with the Georgia Institute of Technology and the University of Cambridge, developed ColloSSL, a collaborative self-supervised learning algorithm for human activity recognition.
Unlabeled sensor data captured simultaneously by multiple devices can be viewed as natural transformations of one another, which then generate supervisory signals for representation learning. The paper proposes three techniques: device selection, contrastive sampling, and a multi-view contrastive loss.
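A multi-view contrastive loss of this kind can be sketched as follows: time-aligned embeddings from two devices form positive pairs, while unaligned samples act as negatives. The InfoNCE-style form, the temperature, and the device names below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def multi_view_nce(anchor, positive, negatives, temp=0.1):
    """InfoNCE-style loss: time-aligned device embeddings are positives."""
    def cos(a, b):
        return (a * b).sum(-1) / (
            np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
        )
    pos = np.exp(cos(anchor, positive) / temp)                    # (N,)
    neg = np.exp(cos(anchor[:, None], negatives) / temp).sum(1)   # (N,)
    return float(np.mean(-np.log(pos / (pos + neg))))

rng = np.random.default_rng(0)
N, D = 8, 16
phone = rng.normal(size=(N, D))                 # "anchor" device embeddings
watch = phone + 0.05 * rng.normal(size=(N, D))  # time-aligned positive device
others = rng.normal(size=(N, 5, D))             # unaligned negative samples
loss = multi_view_nce(phone, watch, others)
```

Because the watch embeddings are near-copies of the phone embeddings, the loss is small; pulling time-aligned views together is exactly the signal the natural-transformation argument provides.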
LoRot
Paper link: https://arxiv.org/pdf/2207.10023.pdf
A research team from Sungkyunkwan University proposes a simple self-supervised auxiliary task, predicting localizable rotations (LoRot), to assist the supervised objective.
The approach has three major characteristics. First, it guides the model to learn rich features. Second, the self-supervised transformation does not significantly shift the training distribution. Third, it is lightweight, versatile, and highly adaptable to prior techniques.
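The auxiliary task can be sketched as follows: rotate one random local patch of an image and use the rotation angle as a free label for an extra classification head. This is a toy numpy version under illustrative assumptions; the patch size and sampling scheme here are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def lorot_transform(img, patch=8):
    """Rotate one random local patch; the rotation class is the free label."""
    h, w = img.shape[:2]
    y = int(rng.integers(0, h - patch + 1))   # top-left corner of the patch
    x = int(rng.integers(0, w - patch + 1))
    k = int(rng.integers(0, 4))               # 0, 90, 180, or 270 degrees
    out = img.copy()
    out[y:y + patch, x:x + patch] = np.rot90(img[y:y + patch, x:x + patch], k)
    return out, k                             # (transformed image, rotation class)

img = rng.random((32, 32))
rotated, label = lorot_transform(img)
```

Because only a small local patch is transformed, the overall image distribution stays close to the original, which is the second property claimed above.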
TS2Vec
The above is the detailed rundown of the top ten self-supervised learning models of 2022, a list dominated by eight achievements from the United States and China.


