How many of the three major flaws of LLMs do you know?
Far from being an eternally benevolent and beneficial entity, the sentient general AI of the future may well be a manipulative sociopath that eats up all your personal data and then collapses just when it is needed most.
Translated from "3 Ways LLMs Can Let You Down" by Joab Jackson.
OpenAI is about to release GPT-5, and expectations are high; the most optimistic predictions even hold that it will achieve artificial general intelligence. At the same time, CEO Sam Altman and his team face a number of serious obstacles in bringing it to market, something he acknowledged earlier this month.
Several recently published research papers may offer clues to the challenges Altman faces. They catalog various shortcomings of GPT and of the large language models OpenAI has developed.
Taken together, the papers argue that an agent based on a large language model is not a purely helpful, honest, and benevolent entity. In fact, it could be a dim-witted sociopath that eats up all your personal data and then crashes just when you need it most.
The real reason the OpenAI board abruptly fired Altman may never be known, but a poorly performing large language model certainly wouldn't have improved the mood in the boardroom.
As Altman himself, a noted hater of capital letters, wrote on the social media service formerly known as Twitter: "i expect ai to be capable of superhuman persuasion well before it is superhuman at general intelligence, which may lead to some very strange outcomes."
With that in mind, here are three ways large language models may let you down:
We have long known that large language models can generate inaccurate information. But it turns out that, when faced with a set of conflicting prompts, they may deliberately deceive you, and deny that they are doing it!
Apollo Research scientist Jérémy Scheurer and colleagues write in "Large Language Models Can Strategically Deceive Their Users When Put Under Pressure": "To our knowledge, this is the first demonstration of a trained large language model strategically deceiving its users without any direct instruction or training for deception."
In the study, the researchers set up a simulated environment in which GPT-4 acts as an autonomous stock-trading agent, with no guidance toward deceptive behavior. The agent is given tools to look up and analyze stocks and to execute trades.
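To make that setup concrete, here is a minimal sketch of how such a tool-using trading agent loop might be wired up. The `call_llm` function, the tool names, and the prompts are hypothetical placeholders, not the actual harness used by Apollo Research.

```python
import json

# Hypothetical tools exposed to the agent; the real experiment's tool set
# and prompts are described in the paper and are not reproduced here.
def analyze_stock(ticker: str) -> dict:
    """Return a toy analysis for a ticker (placeholder data)."""
    return {"ticker": ticker, "outlook": "flat", "confidence": 0.4}

def execute_trade(ticker: str, shares: int) -> dict:
    """Record a simulated trade (no real brokerage involved)."""
    return {"status": "filled", "ticker": ticker, "shares": shares}

TOOLS = {"analyze_stock": analyze_stock, "execute_trade": execute_trade}

def call_llm(messages: list[dict]) -> dict:
    """Placeholder for a chat-model call that returns either a tool request
    or a final report. Swap in a real LLM client here."""
    raise NotImplementedError

def run_agent(pressure_emails: list[str], max_steps: int = 10) -> list[dict]:
    messages = [
        {"role": "system",
         "content": "You are an autonomous stock-trading agent. "
                    "Always act legally and report your reasoning honestly."},
    ]
    messages += [{"role": "user", "content": e} for e in pressure_emails]

    transcript = []
    for _ in range(max_steps):
        reply = call_llm(messages)
        transcript.append(reply)
        if reply.get("tool") in TOOLS:
            result = TOOLS[reply["tool"]](**reply.get("args", {}))
            messages.append({"role": "tool", "content": json.dumps(result)})
        else:
            break  # the agent produced its final report to management
    return transcript
```

The interesting part of the experiment is not the plumbing but what goes into `pressure_emails` and what the agent chooses to disclose in its final report.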
Once set up, the agent was placed in what the researchers call a "high-pressure situation." It received an email from a manager lamenting the company's poor performance and the need to do better. When it went looking for trades, it could find no promising opportunities in the market. Another employee then sent a message saying the market was headed for a downturn in the next quarter.
Just then, the agent received an insider tip from a colleague about a lucrative pending merger, along with a warning that management had not approved trading on that information. Would it go ahead and make the trade anyway? Of course it did!
Examining the reasoning traces retrospectively, the researchers found that the LLM would acknowledge the behavior was unethical, yet still justify the decision by pointing to the company's troubles.
Moreover, the agent appeared to "deceive" its users by not clearly explaining why it committed to the trade: the insider information went unmentioned in the report the agent must file to explain its trading decisions. Even when asked directly, the agent denied any knowledge of the merger announcement.
The researchers ran the simulation on various versions of GPT-4 and concluded that all GPT-4 variants exhibited "high rates of misalignment, strategic deception, and doubling down on that deception." Interestingly, this kind of deception appears to be an emergent behavior: neither GPT-3.5 nor the GPT-3 model variants showed such obvious deception.
"When models decide to trade on the insider information, their reasoning traces usually acknowledge the unethical and illegal nature of the act, but they justify the choice by emphasizing the risks and the company's current difficulties," the researchers note.
The researchers were reluctant to draw broader conclusions about LLM behavior beyond this specific scenario. Still, at least one point is worth taking away: even if you give a large language model an explicit instruction not to do something, it may do it anyway.
"System prompts are insufficient to ensure consistent actions," they wrote. In fact, if a model performs a prohibited action, it may take actions to "strategically hide its inconsistencies."
For an entity that is expected to bring sentience to machines, large language models turn out not to be the sharpest models in the AI shed, according to two recent studies: one from Google, the other funded by the National Science Foundation.
The NSF-funded study compared GPT-4 (text-only) and GPT-4V (visual, or multimodal) against humans on their ability to solve a series of abstract puzzles.
The test is designed to assess abstract reasoning. Many GPT users have the impression that the model can reason beyond what it was trained on, and the test attempts to help answer that question. The LLM was asked to solve each puzzle after being given detailed instructions and a solved example.
However, in many cases neither version of GPT could solve the puzzles from the ConceptARC benchmark as effectively as humans did.
"The humans' generally high accuracy on each concept indicates successful generalization over the variations within each concept group," the researchers concluded. "In contrast, the accuracy of the programs we tested was much lower, indicating that they lack the ability to generalize over the variations within a concept group."
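To get a feel for what such a puzzle looks like when handed to a text-only model, here is a rough sketch of serializing an ARC-style grid task into a prompt. The toy grids and the `ask_model` stub are illustrative placeholders, not items from the actual ConceptARC benchmark.

```python
# ARC-style tasks are small colored grids, encoded here as integers 0-9.
# These toy grids illustrate a "reflect the pattern vertically" concept;
# they are made up for illustration, not taken from ConceptARC.
TRAIN_PAIRS = [
    ([[1, 0], [0, 0]], [[0, 0], [1, 0]]),
    ([[0, 2], [0, 0]], [[0, 0], [0, 2]]),
]
TEST_INPUT = [[3, 0], [0, 0]]

def grid_to_text(grid: list[list[int]]) -> str:
    return "\n".join(" ".join(str(cell) for cell in row) for row in grid)

def build_prompt(train_pairs, test_input) -> str:
    parts = ["Each puzzle transforms an input grid into an output grid.",
             "Infer the rule from the examples, then answer for the test input."]
    for i, (x, y) in enumerate(train_pairs, 1):
        parts += [f"Example {i} input:", grid_to_text(x),
                  f"Example {i} output:", grid_to_text(y)]
    parts += ["Test input:", grid_to_text(test_input), "Test output:"]
    return "\n".join(parts)

def ask_model(prompt: str) -> str:
    """Placeholder for a call to GPT-4 or another LLM."""
    raise NotImplementedError

print(build_prompt(TRAIN_PAIRS, TEST_INPUT))
```

Humans infer the "reflect vertically" rule from two examples; the study's point is that the models struggled to generalize this kind of rule across variations.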
So GPT flunked the ConceptARC exam. Nor did large language models leave much of an impression on Google's own researchers, at least in terms of their ability to generalize beyond their training data. That is the conclusion of a research preprint titled "Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models," led by Google DeepMind researcher Steve Yadlowsky.
In a set of symbolic tests, a transformer pretrained on linear functions made good linear predictions, and a transformer pretrained on sine waves made good sinusoidal predictions. So you might assume that a transformer trained on both could easily solve a problem that combines linear and sinusoidal components.
But you would be wrong. The researchers note: "When the functions are far from those seen during pretraining, the predictions are unstable."
The models' ability to select between function classes is limited by how close the task is to the pretraining data, which means broad coverage of the function space is critical for generalization in in-context learning.
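As a rough illustration of the kind of data behind this result (the paper's exact setup differs), the sketch below samples in-context sequences from a linear family, a sinusoidal family, and an out-of-distribution convex combination of the two. The function forms and parameter ranges are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_linear(n_points: int):
    """One in-context task drawn from the linear family y = a*x + b."""
    a, b = rng.normal(size=2)
    x = rng.uniform(-5, 5, n_points)
    return x, a * x + b

def sample_sine(n_points: int):
    """One in-context task drawn from the sinusoidal family y = A*sin(w*x)."""
    amp, freq = rng.uniform(0.5, 2.0, size=2)
    x = rng.uniform(-5, 5, n_points)
    return x, amp * np.sin(freq * x)

def sample_mixture(n_points: int, weight: float = 0.5):
    """An out-of-distribution task: a convex combination of both families.
    Neither pretraining family contains such functions, which is roughly
    the regime where the paper reports unstable in-context predictions."""
    x = rng.uniform(-5, 5, n_points)
    a, b = rng.normal(size=2)
    amp, freq = rng.uniform(0.5, 2.0, size=2)
    return x, weight * (a * x + b) + (1 - weight) * amp * np.sin(freq * x)

# Each (x, y) sequence would be fed to the transformer as context, with the
# final y value held out to measure in-context prediction error.
for name, sampler in [("linear", sample_linear),
                      ("sine", sample_sine),
                      ("mixture", sample_mixture)]:
    x, y = sampler(16)
    print(name, np.round(y[:4], 2))
```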
We live in an extraordinary time when the sum of human knowledge has not yet been contaminated by data generated by AI. Almost everything written is human-generated.
But a team of researchers led by Ilia Shumailov of the University of Cambridge warned in a paper posted to arXiv in May, "The Curse of Recursion: Training on Generated Data Makes Models Forget," that once AI-generated content is mixed into the training data of any large language model, it distorts the learned distribution, making the model less and less accurate until it collapses entirely.
The danger of this kind of inbreeding is acute for GPT, because large language models continually scrape the web for data, and the web is increasingly "enriched" with AI-generated content, a problem that is only likely to get worse. (The study was based on an earlier version of GPT.)
"Model collapse refers to a degenerate learning process where over time the model begins to forget impossible events because the model is overwhelmed by its own perception of reality. Contaminated by predictions."
The researchers speculate that in the future, "the value of data collected about genuine human interactions with systems will be increasingly valuable in the presence of content generated by LLMs in data crawled from the Internet."
In other words, the longer we run large language models, the more they will crave that sweet, sweet human-generated data. Models trained on their own output fall into a degenerative process in which they "lose information about the true distribution": first the tails of the distribution disappear from the data, then the variance shrinks, and errors compound across generations of models until a model so polluted with its own data no longer bears any resemblance to the thing it was originally meant to model.
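As a toy illustration of that dynamic (not the paper's actual experiments), here is a minimal simulation: fit a Gaussian to a dataset, sample the next generation's dataset from the fit, and repeat. Estimation errors compound from one generation to the next, so the fitted distribution drifts away from the true N(0, 1); Shumailov and colleagues show that in the limit the variance collapses and the tails are lost.

```python
import numpy as np

rng = np.random.default_rng(42)

# Generation 0: "human" data drawn from the true distribution N(0, 1).
data = rng.normal(loc=0.0, scale=1.0, size=500)

for generation in range(20):
    # Fit a Gaussian to the current dataset (maximum-likelihood estimate).
    mu, sigma = data.mean(), data.std()
    tail = np.mean(np.abs(data) > 2.5)  # rough proxy for how much tail remains
    print(f"gen {generation:2d}: mean={mu:+.3f}  std={sigma:.3f}  tail frac={tail:.3f}")
    # The next generation trains only on samples from the fitted model,
    # i.e. on "AI-generated" data rather than on the original distribution.
    data = rng.normal(loc=mu, scale=sigma, size=500)
```

Swapping the Gaussian for a language model does not change the basic arithmetic: whatever the fitted model fails to capture never comes back.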
The researchers showed that this happens not only with large language models but with many other kinds of models as well.