New work from Andrew Ng's team: multimodal many-shot in-context learning adapts quickly to new tasks without fine-tuning



Jun 19, 2024, 08:58 PM



This study evaluates many-shot in-context learning with state-of-the-art multimodal foundation models on 10 datasets and finds sustained performance improvements. Batching queries significantly reduces per-example latency and inference cost without sacrificing performance. These findings show that leveraging a large set of demonstration examples allows rapid adaptation to new tasks and domains without traditional fine-tuning.


Paper address: https://arxiv.org/abs/2405.09798
Code address: https://github.com/stanfordmlgroup/ManyICL

## Background

In recent research on multimodal foundation models, in-context learning (ICL) has proven to be an effective way to improve model performance.

However, because of the limited context length of foundation models, and especially of multimodal foundation models that need a large number of visual tokens to represent each image, prior work has been limited to providing only a small number of examples in context.

Excitingly, recent technical advances have greatly increased model context length, which opens up the possibility of exploring in-context learning with many more examples.

Building on this, ManyICL, the latest work from Andrew Ng's team at Stanford, evaluates how state-of-the-art multimodal foundation models perform in in-context learning as the number of examples grows from few-shot (fewer than 100) to many-shot (up to 2,000). By testing on datasets that span multiple domains and tasks, the team verified that many-shot in-context learning markedly improves model performance, and explored how batching queries affects performance, cost, and latency.

Comparison of many-shot ICL with zero-shot and few-shot ICL.

## Overview of Methods

The study selected three state-of-the-art multimodal foundation models: GPT-4o, GPT4(V)-Turbo, and Gemini 1.5 Pro. Because of GPT-4o's superior performance, the main text focuses on GPT-4o and Gemini 1.5 Pro; results for GPT4(V)-Turbo are in the appendix.

For data, the team ran extensive experiments on 10 datasets spanning different domains (including natural imagery, medical imagery, remote sensing imagery, and molecular imagery) and tasks (including multi-class classification, multi-label classification, and fine-grained classification).


Benchmark data set summary.

To test how increasing the number of examples affects model performance, the team gradually increased the number of demonstration examples provided in context, up to nearly 2,000. Because many-shot learning comes with high cost and high latency, the team also explored batching queries. Here, a batched query means processing multiple queries in a single API call.
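The idea of placing many demonstrations and several queries in one request can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Demo` class, the `build_batched_prompt` function, the `<image:...>` placeholders, and the prompt wording are all hypothetical, and a real multimodal request would attach actual image data through the provider's API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Demo:
    image_ref: str  # hypothetical placeholder standing in for an attached image
    label: str


def build_batched_prompt(demos: List[Demo], queries: List[str],
                         class_names: List[str]) -> str:
    """Assemble one text prompt holding many demonstrations and several queries."""
    lines = [f"Classify each image into one of: {', '.join(class_names)}.", ""]
    # Many-shot part: every labeled demonstration goes into the context.
    for i, demo in enumerate(demos, start=1):
        lines.append(f"Example {i}: <image:{demo.image_ref}> Answer: {demo.label}")
    lines.append("")
    # Batched part: several unlabeled queries share the same context,
    # so the demonstrations are sent once rather than once per query.
    for j, query_ref in enumerate(queries, start=1):
        lines.append(f"Question {j}: <image:{query_ref}> Answer:")
    return "\n".join(lines)


demos = [Demo("cat_01.jpg", "cat"), Demo("dog_01.jpg", "dog"), Demo("cat_02.jpg", "cat")]
prompt = build_batched_prompt(demos, ["q1.jpg", "q2.jpg"], ["cat", "dog"])
print(prompt)
```

Growing `demos` moves the prompt from few-shot toward many-shot, while growing `queries` increases the batch size; the two knobs are independent.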

## Experimental results

### Many-shot in-context learning performance

Overall performance: Many-shot in-context learning with nearly 2,000 examples outperforms few-shot learning on all datasets. Gemini 1.5 Pro improves consistently and log-linearly as the number of examples increases, while GPT-4o's performance is less stable.

Data efficiency: The study measured each model's in-context learning data efficiency, that is, how quickly the model learns from examples. Gemini 1.5 Pro shows higher in-context learning data efficiency than GPT-4o on most datasets, meaning it learns from examples more effectively.
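One way to make "log-linear improvement" and "data efficiency" concrete is to fit a line to performance against the base-10 logarithm of the number of examples and read off the slope. The sketch below assumes that definition for illustration; the `fit_log_linear` helper is hypothetical, and the accuracy values are invented, not results from the paper.

```python
import math
from typing import List, Tuple


def fit_log_linear(shots: List[int], scores: List[float]) -> Tuple[float, float]:
    """Least-squares fit of score = intercept + slope * log10(shots).

    Assumes every entry in `shots` is at least 1, since log10(0) is undefined.
    """
    xs = [math.log10(n) for n in shots]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(scores) / len(scores)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, scores))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical accuracies for illustration only, NOT results from the paper.
shots = [1, 10, 100, 1000]
scores = [0.50, 0.60, 0.70, 0.80]
slope, intercept = fit_log_linear(shots, scores)
print(f"estimated gain per 10x more examples: {slope:.3f}")
```

Under this reading, a larger slope means the model gains more from each tenfold increase in demonstrations, which is one way to compare two models on the same dataset.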

### Impact of batched queries

Overall performance: With the best choice of demonstration set size, combining multiple queries into a single request does not degrade performance in either the zero-shot or the many-shot setting. Notably, in the zero-shot setting a single query performs poorly on many datasets, whereas batched queries can even improve performance.

Performance improvement in the zero-shot setting: On some datasets (such as UCMerced), batching queries substantially improves zero-shot performance. The team attributes this mainly to domain calibration, class calibration, and self-learning (self-ICL).

### Cost and latency analysis

Although many-shot in-context learning must process a longer input context at inference time, batching queries significantly lowers per-example latency and inference cost. For example, on the HAM10000 dataset, using Gemini 1.5 Pro with 350 examples and batched queries, per-example latency dropped from 17.3 seconds to 0.54 seconds and per-example cost dropped from $0.842 to $0.0877.
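The reduction factors implied by these figures, and the reason batching helps, can be worked through in a few lines. The first part uses only the numbers quoted above. The second part is a simplified amortization model with hypothetical token counts (`demo_tokens`, `query_tokens`), included only to illustrate why sharing one long context across many queries lowers the per-example cost.

```python
def reduction_factor(before: float, after: float) -> float:
    """How many times smaller `after` is than `before`."""
    return before / after


# Figures quoted in the article for HAM10000 with Gemini 1.5 Pro.
latency_factor = reduction_factor(17.3, 0.54)
cost_factor = reduction_factor(0.842, 0.0877)
print(f"latency reduced about {latency_factor:.1f}x, cost about {cost_factor:.1f}x")


def per_example_tokens(demo_tokens: int, query_tokens: int, batch_size: int) -> float:
    """Input tokens per query when one shared context serves `batch_size` queries."""
    return (demo_tokens + batch_size * query_tokens) / batch_size


# Hypothetical token counts, chosen only to illustrate the amortization effect.
single = per_example_tokens(demo_tokens=100_000, query_tokens=300, batch_size=1)
batched = per_example_tokens(demo_tokens=100_000, query_tokens=300, batch_size=50)
print(f"tokens per example: {single:.0f} unbatched vs {batched:.0f} batched")
```

The demonstration context is paid for once per request, so its share of each query's cost shrinks roughly in proportion to the batch size.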

## Conclusion

The results show that many-shot in-context learning can significantly improve the performance of state-of-the-art multimodal foundation models. Gemini 1.5 Pro in particular showed continued performance improvements on multiple datasets, allowing it to adapt more effectively to new tasks and domains without traditional fine-tuning.

Second, batching queries can reduce inference cost and latency while achieving similar or even better model performance, which shows great potential for practical applications.

Overall, this work from Andrew Ng's team opens a new path for applying multimodal foundation models, especially for rapid adaptation to new tasks and domains.
