OlaGPT, the first thinking framework that simulates human cognition: six modules enhance the language model and increase reasoning capabilities by up to 85%
When ChatGPT was first released, it stunned us. The model's performance in dialogue was so human-like that it created the illusion that language models have the "ability to think."
However, after studying language models in depth, researchers have gradually discovered that generation based on high-probability language patterns is still far from the hoped-for "artificial general intelligence."
In most current research, large language models perform reasoning tasks mainly by generating chains of thought under the guidance of specific prompts, without considering the human cognitive framework; as a result, language models still lag significantly behind humans on complex reasoning problems.
When humans face complex reasoning problems, they usually bring multiple cognitive abilities to bear and interact with tools, knowledge, and external environmental information. Can language models simulate human thinking processes to solve complex problems?
The answer is yes! OlaGPT, the first model that simulates the human cognitive processing framework, is here!
Paper link: https://arxiv.org/abs/2305.16334
Code link: https://www.php.cn/link/73a1c863a54653d5e184b790fee14754
OlaGPT includes multiple cognitive modules, covering attention, memory, reasoning, and learning, along with corresponding scheduling and decision-making mechanisms. Inspired by human active learning, the framework also includes a learning unit that records previous errors and expert opinions and dynamically consults them to improve the model's ability to solve similar problems.
The article also outlines a common and effective reasoning framework for human problem-solving and designs Chain-of-Thought (CoT) templates accordingly; it further proposes a comprehensive decision-making mechanism that maximizes model accuracy.
Rigorous evaluation on multiple reasoning datasets shows that OlaGPT surpasses previous state-of-the-art benchmarks, demonstrating its effectiveness.
There is still a big gap between current language models and the expected artificial general intelligence, mainly manifested as follows:
1. In some cases, the generated content is meaningless, deviates from human value preferences, or even offers dangerous suggestions; the current remedy is to introduce reinforcement learning from human feedback (RLHF) to rank model outputs.
2. The language model’s knowledge is limited to concepts and facts explicitly mentioned in the training data.
When faced with complex problems, language models cannot, as humans do, adapt to a changing environment, use existing knowledge or tools, reflect on historical lessons, decompose problems, and apply the thinking patterns humans have distilled over long evolution (such as analogy, inductive reasoning, and deductive reasoning) to solve problems.
However, enabling language models to simulate the process by which the human brain handles problems still raises several systemic issues:
1. How can the main modules of the human cognitive framework be systematically imitated and encoded, while scheduling them in a way that follows common human reasoning patterns?
2. How to guide language models to actively learn like humans, that is, learn and develop from historical mistakes or expert solutions to difficult problems?
While retraining the model to encode corrected answers may be feasible, it is obviously costly and inflexible.
3. How to make language models flexibly utilize various thinking modes evolved by humans to improve their reasoning performance?
A fixed, universal thinking pattern is hard to adapt to different problems; when humans face different types of problems, they usually flexibly choose different ways of thinking, such as analogical reasoning or deductive reasoning.
OlaGPT is a problem-solving framework that simulates human thinking and can enhance the capabilities of large language models.
OlaGPT draws on cognitive architecture theory and models the core capabilities of the cognitive framework as attention, memory, learning, reasoning, and action selection.
The researchers fine-tuned the framework for the specific implementation and proposed a process suitable for language models to solve complex problems, consisting of six modules: the intention enhancement module (attention), memory module (memory), active learning module (learning), reasoning module (reasoning), controller module (action selection), and voting module.
Attention is an important part of human cognition, identifying relevant information and filtering out irrelevant data.
Similarly, the researchers designed a corresponding attention module for the language model, namely intent enhancement, which aims to extract the most relevant information and establish a stronger correlation between the user's input and the model's language patterns; it can be regarded as an optimized converter from the user's expression habits to the model's.
The question type is first obtained in advance through specific prompt words, and the way the question is asked is then reconstructed.
For example, add the sentence "Now give you the XX (question type), question and choices:" at the beginning of the question; to facilitate answer parsing, also append "The answer must end with JSON format: Answer: one of options[A,B,C,D,E]."
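The prompt reconstruction described above can be sketched as a small helper; the function name and arguments are illustrative assumptions, not the paper's actual code, but the header and footer strings follow the example given:

```python
def enhance_intent(question: str, question_type: str, choices: str) -> str:
    """Rebuild a raw question into the intent-enhanced prompt format
    (hypothetical helper; the name and signature are assumptions)."""
    header = f"Now give you the {question_type}, question and choices:"
    footer = ("The answer must end with JSON format: "
              "Answer: one of options[A,B,C,D,E].")
    return f"{header}\n{question}\n{choices}\n{footer}"

prompt = enhance_intent(
    "Which number continues the sequence 2, 4, 8, 16?",
    "math multiple-choice question",
    "A. 18  B. 24  C. 32  D. 30  E. 20",
)
```

The fixed footer lets a downstream parser extract the final answer with a single exact match, which the voting module relies on later.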
The memory module plays a vital role in storing knowledge-base information. Studies have demonstrated the limitations of current language models in grasping the latest factual data, so the memory module focuses on consolidating knowledge the model has not yet internalized and storing it in an external library as long-term memory.
The researchers used langchain's memory feature for short-term memory and a Faiss-based vector database for long-term memory.
During a query, the search function extracts relevant knowledge from the library. Four types of memory libraries are covered: facts, tools, notes, and thinking. Facts are real-world information such as common sense; tools include search engines, calculators, and Wikipedia, which can assist the language model with work that does not require editing; notes mainly record difficult cases and the steps for solving them; and the thinking library stores human problem-solving templates written by experts, where the expert can be a human or a model.
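A minimal sketch of the four memory libraries can clarify the layout; the contents and the keyword lookup below are toy stand-ins for the actual vector retrieval, and all names are assumptions:

```python
# Four memory libraries: facts, tools, notes, and thinking templates.
# Entries here are illustrative examples, not the paper's data.
memory_libraries = {
    "facts":    ["Water boils at 100 degrees Celsius at sea level."],
    "tools":    ["search engine", "calculator", "Wikipedia"],
    "notes":    ["Hard case: multi-step arithmetic; solve it digit by digit."],
    "thinking": ["Analogy template: map the new problem onto a solved one."],
}

def search_library(library: str, keyword: str) -> list[str]:
    """Naive keyword lookup standing in for vector-similarity search."""
    return [entry for entry in memory_libraries[library]
            if keyword.lower() in entry.lower()]
```

In the actual system, each library would be embedded and indexed rather than scanned by keyword, as described in the retrieval discussion below.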
The ability to learn is crucial for humans to continuously improve their performance. In essence, all forms of learning rely on experience, and language models can likewise learn from previous mistakes to rapidly improve their reasoning abilities.
First, the researchers identify problems that the language model cannot solve; then they record the insights and explanations provided by experts in the note library; finally, relevant notes are selected to promote the language model's learning so that similar problems can be handled more effectively.
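The three steps above can be sketched as a toy note library; the class, its methods, and the word-overlap relevance heuristic are illustrative assumptions, not the paper's implementation:

```python
class NoteLibrary:
    """Toy active-learning store: record problems the model failed on,
    together with an expert explanation, then retrieve relevant notes
    for a new question. (Names and relevance heuristic are assumptions.)"""

    def __init__(self):
        self.notes = []

    def record_failure(self, question: str, expert_note: str) -> None:
        self.notes.append({"question": question, "note": expert_note})

    def retrieve(self, new_question: str) -> list[str]:
        # Naive relevance: any shared word with a stored failed question.
        words = set(new_question.lower().split())
        return [n["note"] for n in self.notes
                if words & set(n["question"].lower().split())]

lib = NoteLibrary()
lib.record_failure("What is 17 * 24?", "Decompose: 17*24 = 17*20 + 17*4.")
hints = lib.retrieve("What is 17 * 25?")
```

Retrieved notes would then be prepended to the prompt so the model can imitate the expert's solution steps on the similar problem.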
The purpose of the reasoning module is to create multiple agents based on the human reasoning process, thereby stimulating the potential thinking ability of the language model and solving reasoning problems.
This module combines multiple thinking templates, referencing specific thinking types such as lateral thinking, sequential thinking, critical thinking, and integrative thinking, to facilitate reasoning tasks.
The controller module mainly handles action selection, including the model's internal planning tasks (such as selecting certain modules for execution) and choosing from the facts, tools, notes, and thinking libraries.
Relevant libraries are first retrieved and matched, and the retrieved content is then integrated into a template agent, which asks the language model to respond under a template in an asynchronous manner. Just as humans may struggle to identify all relevant information at the start of reasoning, it is equally difficult to expect language models to do so from the outset.
Therefore, dynamic retrieval is performed based on the user's question and the intermediate reasoning progress, using Faiss to build embedding indexes for the four libraries above, with slightly different retrieval strategies for each library.
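The retrieval idea can be illustrated without the Faiss dependency; the pure-Python cosine-similarity search below is a stand-in for a Faiss index over one library, and the toy 3-d "embeddings" are invented for illustration:

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve(query_vec: list[float], index: list[tuple], k: int = 1) -> list[str]:
    """Return the k entries closest to the query embedding.
    Stands in for a Faiss similarity search over one memory library."""
    scored = sorted(index, key=lambda item: cosine(query_vec, item[0]),
                    reverse=True)
    return [text for _, text in scored[:k]]

# Toy 3-d "embeddings" for a notes library (illustrative values).
notes_index = [
    ([1.0, 0.0, 0.0], "note on arithmetic decomposition"),
    ([0.0, 1.0, 0.0], "note on analogy to a solved puzzle"),
]
top = retrieve([0.9, 0.1, 0.0], notes_index, k=1)
```

In practice each library gets its own index and its own query construction, since a tool lookup and a thinking-template lookup weight the intermediate reasoning state differently.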
Since different thinking templates may suit different types of problems, the researchers designed the voting module to improve calibration across multiple thinking templates, using a voting strategy among them to generate the best answer and improve performance.
Specific voting methods include:
1. Language model voting: guide the language model to select the most consistent answer among multiple given options and provide a reason.
2. Regex voting: use regular-expression exact matching to extract answers and tally the votes.
To evaluate the effectiveness of the enhanced language model framework on reasoning tasks, the researchers conducted a comprehensive experimental comparison on two types of reasoning datasets.
It can be seen from the results:
1. SC (self-consistency) performs better than GPT-3.5-turbo, indicating that ensemble methods do, to a certain extent, help improve the effectiveness of large models.
2. The performance of the method proposed in this article exceeds SC, which proves the effectiveness of the thinking template strategy to a certain extent.
Answers under different thinking templates show considerable variation, and voting across different thinking templates ultimately produces better results than simply conducting multiple rounds of voting.
3. Different thinking templates have different effects, and step-by-step solutions may be more suitable for reasoning problems.
4. The active learning module performs significantly better than the zero-shot method.
Including challenging cases in the note library and using random, retrieval, and combination lists can improve performance, making this a feasible strategy.
5. Different retrieval schemes perform differently on different datasets; in general, the combination strategy gives better results.
6. The proposed method clearly outperforms the other solutions, owing to the sound design of the overall framework: the effective design of the active learning module; thinking templates that adapt to different models, with results varying across templates; the controller module, which plays its control role well and selects content matching what is required; and the voting module's effective integration of different thinking templates.
Reference materials:
https://www.php.cn/link/73a1c863a54653d5e184b790fee14754