Stanford and OpenAI proposed meta-prompting, and the strongest zero-shot prompting technique was born


王林

Feb 07, 2024 pm 03:30 PM

The latest generation of language models (such as GPT-4, PaLM, and LLaMA) has made important breakthroughs in natural language processing and generation. These large-scale models can handle tasks ranging from writing Shakespearean sonnets to summarizing complex medical reports and even solving competition-level programming problems. Although they can solve a diverse range of problems, they are not always correct: they sometimes generate inaccurate, misleading, or contradictory responses. Their outputs therefore still need to be evaluated and verified for accuracy and reliability.

As the cost of running models decreases, people are beginning to consider scaffolding systems that query language models multiple times to improve the accuracy and stability of model output. Such approaches can improve model performance and give users a better experience.

This research from Stanford and OpenAI proposes a new technique, called meta-prompting, for improving the capability and performance of language models.

Paper title: Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Paper address: https://arxiv.org/abs/2401.12954
Project address: https://github.com/suzgunmirac/meta-prompting

The technique involves constructing a high-level "meta" prompt that instructs the language model to do the following:

1. Decompose complex tasks or problems into smaller sub-tasks that are easy to solve;

2. Assign these subtasks to specialized "expert" models using appropriate and detailed natural language instructions;

3. Supervise the communication between these expert models;

4. Apply its own critical thinking, reasoning, and verification skills throughout this process.

A language model invoked through meta-prompting acts as a conductor when queried. It outputs a message history (or narrative) that includes the responses of multiple expert models. The language model is first responsible for generating the commander part of the message history, which includes selecting experts and constructing specific instructions for them. The same language model also acts as each independent expert, generating output based on the expertise and information the commander selects for each specific query.

This approach allows a single unified language model to maintain a coherent line of reasoning while also leveraging a variety of expert roles. By dynamically selecting context for prompting, these experts can bring a fresh perspective to the process, while the commander model maintains a bird's-eye view of the complete history and maintains coordination.

As a result, a single black-box language model can effectively serve as both the central commander and a series of different experts, producing more accurate, reliable, and consistent responses.

Meta-prompting combines and extends a variety of prompting ideas proposed in recent research, including high-level planning and decision-making, dynamic persona assignment, multi-agent debate, self-debugging, and self-reflection.

A key aspect of meta-prompting is its property of being task-agnostic.

Unlike traditional scaffolding methods, which require specific instructions or examples tailored to each task, meta-prompting uses the same set of high-level instructions across multiple tasks and inputs. This versatility is especially useful for users who want to avoid extra effort, since it eliminates the need to provide detailed examples or specific instructions for each task.

For example, for a one-time request like "Write a Shakespearean sonnet about taking a selfie," users don't need to supplement it with high-quality examples of neoclassical poetry.

Meta-prompting can improve the usefulness of language models by providing a broad and flexible framework without compromising specificity or relevance. In addition, to demonstrate the method's versatility and integration capabilities, the team extended the system so that it can call a Python interpreter. This lets the technique support more dynamic and comprehensive applications and handle a wider range of tasks and queries.

Figure 2 shows an example of a meta-prompting conversation flow.

It depicts how the Meta Model (the commander model) intersperses its own output with the inputs and outputs of various specialized expert models or code executions. This configuration makes meta-prompting a nearly universal tool: it allows the interactions and computations of multiple language models to be aggregated into a single, coherent narrative. What sets meta-prompting apart is that it lets the language model decide for itself which prompts or which code snippets to use.

The team conducted comprehensive experiments using GPT-4 as the base language model, comparing meta-prompting with other task-agnostic scaffolding methods.

The experiments found that meta-prompting not only improves overall performance but also often achieves new state-of-the-art results across multiple tasks. Its flexibility is particularly noteworthy: the commander model can call on the expert model (which is essentially itself, with different instructions) to perform a variety of functions. These may include reviewing previous output, choosing a specific AI persona for a specific task, refining the generated content, and ensuring that the final output meets the required standards in both substance and form.

As Figure 1 shows, the new method clearly improves on previous methods.

Meta-prompting

Intuition and abstract overview. Meta-prompting works by using a model to coordinate and execute multiple independent queries, then combining their responses to produce a final response. In principle, this mechanism takes an ensemble approach, drawing on the strength and diversity of independent specialized models to collaboratively tackle multi-faceted tasks or problems.

The core of the meta-prompting strategy is its shallow hierarchical structure, in which a single model (called the meta-model) serves as the authoritative central entity.

This prompting structure resembles an orchestra: the meta-model plays the role of the conductor, and each musician corresponds to a different domain-specific model. Just as a conductor coordinates multiple instruments to play a harmonious melody, the meta-model combines answers and insights from multiple models to provide accurate and comprehensive answers to complex questions or tasks.

Conceptually, within this framework, domain-specific experts can take many forms: language models fine-tuned for specific tasks, dedicated APIs for handling specific types of queries, or even computational tools such as a calculator or a Python interpreter for executing code. These functionally diverse experts are instructed and unified under the supervision of the meta-model and cannot directly interact or communicate with one another.

Algorithmic Procedure. Algorithm 1 gives the pseudocode of the newly proposed meta-prompting method.

In brief, the first step is to transform the input so that it conforms to the appropriate template. The following loop is then executed: (a) submit the prompt to the meta-model, (b) consult domain-specific expert models if necessary, (c) return a final response, and (d) handle errors.
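The loop summarized above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the marker formats in `EXPERT_RE` and `FINAL_RE`, the `meta_prompt` function, the round limit, and the `scripted_lm` stand-in for a real model are all assumptions made for this example. The one property it tries to reflect is that the expert call receives only the instructions written by the meta-model, not the full history.

```python
import re
from typing import Callable, Optional

# Hypothetical marker formats, chosen for this sketch only.
EXPERT_RE = re.compile(r'^(Expert [^\n:]+):\s*"""(.*?)"""', re.S | re.M)
FINAL_RE = re.compile(r'>> FINAL ANSWER:\s*"""(.*?)"""', re.S)


def meta_prompt(query: str, lm: Callable[[str], str], max_rounds: int = 8) -> Optional[str]:
    """Run a meta-prompting style loop with one model playing both roles."""
    history = [f"Question: {query}"]  # step 1: fit the input into a template
    for _ in range(max_rounds):
        meta_out = lm("\n".join(history))  # (a) meta-model sees the full history
        history.append(meta_out)
        expert = EXPERT_RE.search(meta_out)
        final = FINAL_RE.search(meta_out)
        if expert:  # (b) consult an expert with fresh context
            name, instructions = expert.group(1), expert.group(2).strip()
            reply = lm(instructions)  # the expert sees only its instructions
            history.append(f"{name} replied: {reply}")
        elif final:  # (c) return the final response
            return final.group(1).strip()
        else:  # (d) handle malformed output
            history.append("Error: call an expert or give a final answer.")
    return None  # no final answer within the round limit


def scripted_lm(prompt: str) -> str:
    """Toy stand-in for a language model, used only to exercise the loop."""
    if prompt.startswith("Question:"):  # meta-model role
        if "replied:" in prompt:
            return '>> FINAL ANSWER:\n"""4"""'
        return 'Expert Mathematician:\n"""Compute 2 + 2 and reply with the number only."""'
    return "4"  # expert role


print(meta_prompt("What is 2 + 2?", scripted_lm))
```

In a real setup, `lm` would wrap a call to an actual model, and the meta-model's system instructions would be prepended to the history.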

Note that in the team's experiments, the meta-model and the expert models are both GPT-4. Their roles differ only in the instructions each receives: the meta-model follows the set of instructions shown in Figure 3, while the expert models follow instructions that the meta-model determines dynamically at inference time.

Experimental setup

Benchmark

The team compared meta-prompting with task-agnostic, zero-shot versions of several other prompting methods, including standard prompting, expert (dynamic) prompting, and multi-persona prompting.

Datasets and tasks

The team used a variety of tasks and data sets in their experiments that require a variety of different abilities, such as mathematical and algorithmic reasoning, domain-specific knowledge, and literary creativity. These datasets and tasks include:

Game of 24: The goal is to construct an arithmetic expression that evaluates to 24 using four given values, each exactly once.
Three BIG-Bench Hard (BBH) tasks: Geometric Shapes, MultiStep Arithmetic Two, and Word Sorting; plus Checkmate-in-One, a reasoning task taken directly from the BIG-Bench suite.
Python Programming Puzzles (P3): Python programming problems at multiple difficulty levels.
Multilingual Grade School Math (MGSM): a multilingual version of the GSM8K dataset, covering Bengali, Japanese, and Swahili.
Shakespearean Sonnet Writing: a new task created by the team, in which the goal is to write a sonnet (a fourteen-line poem) that strictly follows the rhyme scheme "ABAB CDCD EFEF GG" and contains three provided words verbatim.

Answer extraction and evaluation protocol

As shown in Figure 3, the system instructions for meta-prompting encourage the meta-model to give its final answer in a specific format.

As for evaluation, one of the following three indicators will be used, depending on the nature and form of the task:

Exact Match (EM)
Soft Match (SM)
Functionally Correct (FC)
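To make the three indicators concrete, here is a minimal Python sketch of how such checks might be written. The function names and the exact matching rules (whitespace stripping, case-insensitive containment) are assumptions for illustration, not the team's evaluation code. Exact match requires the answer to equal the reference, soft match only requires the reference to appear somewhere in the output, and functional correctness checks task constraints, shown here for Game of 24.

```python
import ast
from collections import Counter
from typing import List


def exact_match(prediction: str, gold: str) -> bool:
    """EM: the prediction must equal the reference (ignoring outer whitespace)."""
    return prediction.strip() == gold.strip()


def soft_match(prediction: str, gold: str) -> bool:
    """SM: the reference only needs to appear somewhere in the prediction."""
    return gold.strip().lower() in prediction.lower()


def game_of_24_correct(expression: str, numbers: List[int]) -> bool:
    """FC for Game of 24: uses each number exactly once and evaluates to 24."""
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError:
        return False
    allowed = (ast.Expression, ast.BinOp, ast.Constant,
               ast.Add, ast.Sub, ast.Mult, ast.Div)
    used = []
    for node in ast.walk(tree):
        if not isinstance(node, allowed):
            return False  # reject anything beyond +, -, *, / and literals
        if isinstance(node, ast.Constant):
            if type(node.value) is not int:
                return False
            used.append(node.value)
    if Counter(used) != Counter(numbers):
        return False  # each given number must appear exactly once
    try:
        value = eval(compile(tree, "<expr>", "eval"), {"__builtins__": {}})
    except ZeroDivisionError:
        return False
    return abs(value - 24) < 1e-9


print(exact_match("24", "24"))
print(soft_match("The answer is 24.", "24"))
print(game_of_24_correct("(13 - 11) * 6 + 12", [6, 11, 12, 13]))
```

Restricting the parsed expression to a small set of node types before evaluating it keeps the check from running arbitrary model-generated code.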

Models and Inference

The team's main experiments all used GPT-4 (gpt-4-32k). Some additional experiments used GPT-3.5 (gpt-35-turbo). Both the GPT-3.5 and GPT-4 models are fine-tuned to follow instructions.

In all experiments, the parameters and system instructions used by the meta-model are the same. The temperature value is set to 0, the top-p value is set to 0.95, and the maximum number of tokens is 1024.

Main Results and Discussion

Table 1 summarizes the experimental results, which show meta-prompting outperforming the other methods.

Looking at overall performance across all tasks, meta-prompting brings significant accuracy improvements, especially when assisted by the Python interpreter tool.

Specifically, meta-prompting outperforms standard prompting by 17.1%, expert (dynamic) prompting by 17.3%, and multi-persona prompting by 15.2%.

In addition, Figures 4 and 5 show that integrating the Python interpreter improves overall performance across tasks by 11.5% compared with meta-prompting without the interpreter.

The paper also discusses in depth the key insights gained from the experiments, including the performance advantage of meta-prompting, zero-shot decomposition, error detection, information aggregation, and code execution. We won't go into detail here, but the concept of Fresh Eyes is worth introducing.

Fresh Eyes, that is, looking at a problem with another pair of eyes, helps alleviate a well-known problem with language models: they tend to double down on their mistakes and exhibit overconfidence.

Fresh Eyes is a key difference between meta-prompting and multi-persona prompting, and the experimental results demonstrate its advantages. In meta-prompting, experts (or personas) can be used to re-evaluate the problem. This offers an opportunity to gain new insights and potentially catch incorrect answers that had previously gone undetected.

Grounded in principles from cognitive psychology, Fresh Eyes can lead to more creative problem solving and better error detection.

The example below demonstrates the benefits of Fresh Eyes in practice. Suppose the task is Game of 24 and the values provided are 6, 11, 12, and 13. The goal is to construct an arithmetic expression that evaluates to 24, using each number exactly once. The message history might look something like this:

1. The meta-model proposes consulting expert models in mathematical problem solving and Python programming. It emphasizes the need for accuracy and compliance with the constraints, and recommends involving another expert if necessary.

2. One expert gives a solution, but another expert judges it to be wrong, so the meta-model suggests writing a Python program to find a valid solution.

3. A programming expert is consulted and asked to write the program.

4. Another programming expert finds an error in the script, fixes it, and executes the modified script.

5. A math expert is consulted to verify the solution output by the program.

6. Once verification is complete, the meta-model outputs the solution as the final answer.

This example shows how meta-prompting can incorporate new perspectives at every step, not only to find answers, but also to effectively identify and correct errors.
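For reference, the kind of program a programming expert might write for this task is sketched below. It is an illustrative brute-force search, not code from the paper; the names `solve_24` and `_search` are hypothetical. It repeatedly picks two values, combines them with one of the four operations, and recurses on the shorter list, using exact fractions so that division does not introduce rounding errors.

```python
from fractions import Fraction
from typing import List, Optional, Tuple

Item = Tuple[Fraction, str]  # (exact value, expression that produced it)


def _search(items: List[Item], target: Fraction) -> Optional[str]:
    if len(items) == 1:
        value, expr = items[0]
        return expr if value == target else None
    for i in range(len(items)):
        for j in range(len(items)):
            if i == j:
                continue
            (a, ea), (b, eb) = items[i], items[j]
            rest = [items[k] for k in range(len(items)) if k not in (i, j)]
            candidates = [
                (a + b, f"({ea} + {eb})"),
                (a - b, f"({ea} - {eb})"),
                (a * b, f"({ea} * {eb})"),
            ]
            if b != 0:  # skip division by zero
                candidates.append((a / b, f"({ea} / {eb})"))
            for value, expr in candidates:
                found = _search(rest + [(value, expr)], target)
                if found is not None:
                    return found
    return None


def solve_24(numbers: List[int], target: int = 24) -> Optional[str]:
    """Return an expression using each number once that equals target, or None."""
    items = [(Fraction(n), str(n)) for n in numbers]
    return _search(items, Fraction(target))


# One valid answer for the values in the example is (13 - 11) * 6 + 12;
# the search may return a different but equally valid expression.
print(solve_24([6, 11, 12, 13]))
```

In the meta-prompting flow described above, the output of such a program would still be passed to a separate expert for verification before the meta-model accepts it.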

The team concludes by discussing other issues related to meta-prompting, including an analysis of the types of experts used, the number of dialogue rounds needed to reach the final result, and how to handle unsolvable problems. Please refer to the original paper for details.

Statement

This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to request deletion.
