Don't think step by step! Google's latest natural language reasoning algorithm LAMBADA: "backward chaining" is the answer
Automated reasoning remains a major challenge in natural language processing: a model must derive valid, correct conclusions from given premises and knowledge.
Although large-scale pre-trained language models have achieved very high performance on "natural language understanding" tasks such as reading comprehension and question answering in recent years, their performance on logical reasoning still lags behind.
In May last year, "Chain of Thought" (CoT) prompting appeared: researchers found that simply adding "Let's think step by step" to the prompt greatly improves GPT-3's reasoning performance. On MultiArith, for example, inference accuracy jumped from 17.7% to 78.7%.
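As a concrete illustration, below is a minimal sketch of this zero-shot CoT trick. The `call_language_model` helper is a hypothetical placeholder for whatever LM API is actually used, and the prompt format is only an assumption, not the exact prompt from the CoT work.

```python
# Minimal sketch of zero-shot chain-of-thought prompting.
# `call_language_model` is a hypothetical placeholder, not a real library call.

def call_language_model(prompt: str) -> str:
    """Stand-in for an actual LM call (e.g. an HTTP request to a model API)."""
    raise NotImplementedError

def zero_shot_cot(question: str) -> str:
    # Appending the trigger phrase encourages the model to write out
    # intermediate reasoning steps before giving its final answer.
    prompt = f"Q: {question}\nA: Let's think step by step."
    return call_language_model(prompt)
```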
However, methods such as CoT and Selection Inference search for a proof in the forward direction, from the axioms toward the final conclusion. This suffers from a combinatorial explosion of the search space, so the failure rate rises for longer reasoning chains.
Recently, Google Research developed a backward chaining algorithm, LAMBADA (LAnguage Model augmented BAckwarD chAining), which applies a conclusion from the classical reasoning literature, namely that backward reasoning is significantly more efficient than forward reasoning, to language models (LMs).
Paper link: https://arxiv.org/abs/2212.13894
LAMBADA decomposes the reasoning process into four sub-modules, each implemented by few-shot prompted language model inference.
On two logical reasoning datasets, LAMBADA achieves significant gains over the current state-of-the-art forward reasoning methods, and the improvement is especially pronounced when the problem requires deep and accurate proof chains.
Logical reasoning, especially over unstructured natural text, is a basic building block of automated knowledge discovery and a key to future progress across scientific fields.
Although many NLP tasks have benefited from the growing scale of pre-trained language models, it has been observed that increasing model size yields only limited gains on complex reasoning problems.
The classical literature describes two main approaches to logical reasoning:
1. Forward Chaining (FC): start from the facts and rules, and iterate between drawing new inferences and adding them to the theory until the goal statement can be proven or disproven;
2. Backward Chaining (BC): start from the goal and recursively decompose it into sub-goals until those sub-goals can be proven or disproven from the facts.
Previous language-model reasoning methods mostly follow the forward chaining idea, which requires selecting a subset of facts and rules from the entire set; this is difficult for an LM because it requires combinatorial search over a large space.
In addition, deciding when to stop the search and declare the proof failed is also difficult in FC, sometimes even requiring a module trained specifically on intermediate labels.
In fact, the classical automated reasoning literature largely focuses on backward chaining or goal-directed proof strategies.
LAMBADA
LAMBADA stands for "LAnguage Model augmented BAckwarD chAining", i.e. a language model augmented with backward chaining. The researchers' experiments show that BC is better suited to text-based deductive logical reasoning.
BC does not require a large combinatorial search to select subsets, and it has more natural halting criteria.
LAMBADA focuses on automated reasoning over facts, that is, natural-language assertions such as "Nice people are red", which are coherent but not necessarily grounded in reality.
A rule is a natural-language statement that can be rewritten in the form "if P then Q"; for example, "Rough, nice people are red" can be rewritten as "If a person is rough and nice, then they are red".
P is called the antecedent of the rule, and Q is called the consequent of the rule.
A theory C consists of facts F = {f1, f2, ..., fn} and rules R = {r1, r2, ..., rm}; G denotes a goal that one wants to prove or disprove based on the facts and rules.
Example 1: an example theory C with fictional characters and rules:
F = {"Fiona is nice", "Fiona is rough"}
R = {"If someone is smart, then they are nice", "Rough, nice people are red", "If someone is nice and red, then they are round"}.
Given this theory, one might want to prove or disprove a goal such as "Fiona is red?".
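As a small illustration (not the paper's code), the theory of Example 1 can simply be kept as natural-language strings, which is the form LAMBADA actually reasons over:

```python
# Example 1 held as plain natural-language strings.
facts = [
    "Fiona is nice",
    "Fiona is rough",
]

rules = [
    "If someone is smart, then they are nice",
    "Rough, nice people are red",
    "If someone is nice and red, then they are round",
]

goal = "Fiona is red?"
```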
Backward chaining reasoning
Whether a rule applies to a goal is determined through an operation called unification in logic.
For example, for the goal "Fiona is red?" in Example 1, the consequent of the second rule matches the goal, so that rule applies; the consequents of the other two rules are different, so they do not apply.
Given the theory and goal of Example 1, BC starts reasoning from the goal "Fiona is red?".
First, BC checks whether the goal can be proven or disproven from any fact. Since no fact proves or disproves it, BC next checks whether the goal matches the consequent of any rule, and finds that it matches the second rule, "Rough, nice people are red".
The goal can therefore be decomposed into two sub-goals: 1) Fiona is rough? and 2) Fiona is nice?
Since both subgoals can be proven from the facts, BC concludes that the original goal can be proven.
For any goal, the result of BC is either proven, disproven, or unknown (for example, the goal "Fiona is smart?" is unknown).
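To make the procedure concrete, here is a toy sketch of classical backward chaining over a structured version of Example 1. It illustrates the algorithm itself, not LAMBADA (which works on raw text through LM modules), and for simplicity it only distinguishes "proven" from "not proven" rather than the full three-way outcome.

```python
# Toy backward chaining over structured rules.
# Each rule is a pair (antecedent atoms, consequent atom); facts and goals are atoms.

from typing import List, Tuple

Rule = Tuple[List[str], str]  # (antecedents, consequent)

def backward_chain(goal: str, facts: List[str], rules: List[Rule], depth: int = 5) -> bool:
    if goal in facts:                          # fact check
        return True
    if depth == 0:                             # natural halting criterion
        return False
    for antecedents, consequent in rules:      # rule selection (here: exact string match)
        if consequent == goal:
            # goal decomposition: every antecedent becomes a sub-goal
            if all(backward_chain(sub, facts, rules, depth - 1) for sub in antecedents):
                return True
    return False

# Example 1 in structured form:
facts = ["Fiona is nice", "Fiona is rough"]
rules = [
    (["Fiona is smart"], "Fiona is nice"),
    (["Fiona is rough", "Fiona is nice"], "Fiona is red"),
    (["Fiona is nice", "Fiona is red"], "Fiona is round"),
]
print(backward_chain("Fiona is red", facts, rules))    # True
print(backward_chain("Fiona is smart", facts, rules))  # False (i.e. not provable)
```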
To use BC for text-based reasoning, the researchers introduced four LM-based modules: Fact Check, Rule Selection, Goal Decomposition, and Sign Agreement.
Fact check
Given the set of facts F in the theory and a goal G, the fact check module verifies whether there exists a fact f ∈ F that proves G (in which case the goal is proven) or disproves it (in which case the goal is disproven).
If no such fact can be found, the truth of G remains unknown.
Fact checking is implemented with two sub-modules: the first selects the fact from the fact set that is most relevant to the goal, and the second verifies whether the goal can be proven or disproven based on that fact.
Since the fact selection sub-module may not pick the best fact on the first try, if the truth of the goal is still unknown after one round of calls, the selected fact can be removed from the fact set and the sub-modules called again; this process can be repeated several times.
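A hedged sketch of how this module could be framed as prompted LM calls is shown below. The prompt wording is illustrative rather than the paper's actual few-shot prompts, and `call_language_model` is the same hypothetical placeholder used in the earlier sketches.

```python
# Sketch of the Fact Check module: fact selection + fact verification + retry loop.
from typing import List

def select_fact(facts: List[str], goal: str) -> str:
    prompt = ("Select the single fact most relevant to the goal.\n"
              f"Facts: {facts}\nGoal: {goal}\nSelected fact:")
    return call_language_model(prompt).strip()

def verify_fact(fact: str, goal: str) -> str:
    prompt = ("Does the fact prove, disprove, or say nothing about the goal?\n"
              f"Fact: {fact}\nGoal: {goal}\nAnswer (PROVED / DISPROVED / UNKNOWN):")
    return call_language_model(prompt).strip().upper()

def fact_check(facts: List[str], goal: str, max_rounds: int = 2) -> str:
    remaining = list(facts)
    for _ in range(max_rounds):
        if not remaining:
            break
        fact = select_fact(remaining, goal)
        verdict = verify_fact(fact, goal)
        if verdict in ("PROVED", "DISPROVED"):
            return verdict
        if fact in remaining:          # drop the chosen fact and try again
            remaining.remove(fact)
    return "UNKNOWN"
```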
Rule selection
Given the set of rules R in the theory and a goal G, the rule selection module identifies the rules r ∈ R whose consequent aligns with G; these rules are then used to decompose the goal into sub-goals.
If no such rule can be identified, the truth of G remains unknown.
Rule selection also consists of two sub-modules: the first identifies the consequent of each rule (independently of the goal), and the second takes a rule's consequent and the goal as input and determines whether they align.
Note that, because of BC's recursive nature, the rule selection module may be called many times while proving a goal. Since identifying each rule's consequent is independent of the goal, that sub-module only needs to be called once per rule.
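A hedged sketch of rule selection along these lines (again with illustrative prompts and the hypothetical `call_language_model` placeholder) might look like this, with the per-rule consequents cached so they are computed only once:

```python
# Sketch of the Rule Selection module: consequent identification (cached) + alignment check.
from typing import Dict, List

def identify_consequent(rule: str) -> str:
    prompt = f"State the consequent (the 'then' part) of this rule.\nRule: {rule}\nConsequent:"
    return call_language_model(prompt).strip()

def rule_selection(rules: List[str], goal: str, consequent_cache: Dict[str, str]) -> List[str]:
    selected = []
    for rule in rules:
        if rule not in consequent_cache:           # computed once per rule
            consequent_cache[rule] = identify_consequent(rule)
        prompt = ("Does the consequent align with the goal?\n"
                  f"Consequent: {consequent_cache[rule]}\nGoal: {goal}\nAnswer (yes/no):")
        if call_language_model(prompt).strip().lower().startswith("yes"):
            selected.append(rule)
    return selected
```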
Goal decomposition
Given a rule r whose consequent aligns with a goal G, the goal decomposition module determines the sub-goals that need to be proven in order for G to be proven or disproven.
If the antecedent of r is successfully proven, whether the goal is proven or disproven depends on whether the sign of the goal agrees with the sign of r's consequent.
For example, for the goal "Fiona is red?", since the sign of the goal agrees with the sign of the second rule's consequent and the rule's antecedent is proven, it can be concluded that the goal is proven.
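A hedged sketch of goal decomposition as a single LM call (illustrative prompt, same hypothetical placeholder):

```python
# Sketch of the Goal Decomposition module.
from typing import List

def goal_decomposition(rule: str, goal: str) -> List[str]:
    prompt = ("List the sub-goals that must hold for the rule's antecedent "
              "to apply to the goal, one per line.\n"
              f"Rule: {rule}\nGoal: {goal}\nSub-goals:")
    reply = call_language_model(prompt)
    return [line.strip() for line in reply.splitlines() if line.strip()]

# For instance, goal_decomposition("Rough, nice people are red", "Fiona is red?")
# would be expected to return something like ["Fiona is rough?", "Fiona is nice?"].
```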
Sign agreement
Given a rule r and a goal G, the sign agreement module verifies whether the sign of r's consequent agrees or disagrees with the sign of the goal.
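Finally, here is a hedged sketch of the sign agreement module and of one way the four module sketches above could be composed into the recursive backward-chaining loop. This outline simplifies by requiring every sub-goal to be PROVED; it is an assumption-laden illustration, not the paper's implementation.

```python
# Sketch of the Sign Agreement module plus a simplified LAMBADA-style loop,
# reusing fact_check, rule_selection and goal_decomposition from the sketches
# above and the hypothetical call_language_model placeholder.
from typing import Dict, List, Optional

def sign_agreement(rule: str, goal: str) -> bool:
    prompt = ("Does the sign (negated or not) of the rule's consequent agree "
              "with the sign of the goal?\n"
              f"Rule: {rule}\nGoal: {goal}\nAnswer (agree/disagree):")
    return call_language_model(prompt).strip().lower().startswith("agree")

def lambada(goal: str, facts: List[str], rules: List[str],
            depth: int = 5, cache: Optional[Dict[str, str]] = None) -> str:
    cache = {} if cache is None else cache
    verdict = fact_check(facts, goal)          # try to settle the goal from the facts
    if verdict != "UNKNOWN" or depth == 0:
        return verdict
    for rule in rule_selection(rules, goal, cache):
        subgoals = goal_decomposition(rule, goal)
        if all(lambada(sub, facts, rules, depth - 1, cache) == "PROVED"
               for sub in subgoals):
            # antecedent proven: the goal's status depends on sign agreement
            return "PROVED" if sign_agreement(rule, goal) else "DISPROVED"
    return "UNKNOWN"
```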
Experiments
The researchers chose Chain of Thought (CoT), the state-of-the-art neural reasoning method based on explicit reasoning, and Selection Inference (SI), the state-of-the-art modular reasoning method, as baselines for comparison.
The experiments use the ProofWriter and PrOntoQA datasets. These datasets are challenging for LM reasoning: they contain examples that require proof chains of up to 5 hops, as well as examples whose goals can neither be proven nor disproven from the provided theory.
The experimental results show that LAMBADA significantly outperforms both baselines, especially on the ProofWriter-PUD dataset containing UNKNOWN labels (44% relative improvement over CoT and 56% over SI at depth-5) and at the higher depths of PrOntoQA (37% relative improvement over CoT and 113% over SI at depth-5).
These results demonstrate LAMBADA's advantages in logical reasoning and suggest that backward chaining (the reasoning backbone in LAMBADA) may be a better choice than forward chaining (the backbone in SI).
The results also reveal a flaw in CoT when dealing with UNKNOWN labels: unlike examples labeled PROVED or DISPROVED, examples labeled UNKNOWN have no natural chain of thought.
For problems with deeper proof chains (depth 3 and above), SI produces predictions close to the majority class on the three datasets.
In the binary case it tends to over-predict DISPROVED, and in the three-class case it tends to over-predict UNKNOWN; as a result, on PrOntoQA at depth-5 its performance is even worse than the majority class, because there are more PROVED labels than DISPROVED at that depth.
However, the researchers were surprised to find that CoT's performance on the ProofWriter-PD dataset remained relatively high, with accuracy that did not degrade.
In summary, LAMBADA achieves higher reasoning accuracy on these datasets, is far more likely to produce valid reasoning chains than techniques that reach correct conclusions through spurious proof traces, and is also more query-efficient than other LM-based modular reasoning methods.
The results of this experiment strongly suggest that future work on reasoning with LMs should include backward chaining or goal-directed strategies, the researchers said.
Reference:
https://arxiv.org/abs/2212.13894