OpenAI CEO says: Expanding scale is not the only way to progress, and the era of giant AI models may be coming to an end
News on April 18: OpenAI's chatbot ChatGPT is so capable that it has sparked enormous interest in, and investment into, artificial intelligence. However, the company's CEO, Sam Altman, believes the research strategy that produced it has run its course, and that future AI progress will require new ideas.
In recent years, OpenAI has made an impressive series of advances in language processing by scaling existing machine learning algorithms to previously unimaginable sizes. Its most recent project is GPT-4, which the company says was trained on trillions of words of text using thousands of powerful computer chips, at a cost of more than $100 million.
However, Altman said that future advances in AI will no longer come from making models larger. "I think we're at the end of an era," he said at an MIT event. "In this [outgoing] era, models got bigger and bigger. Now, we're going to make them better in other ways."
Altman's comments represent an unexpected turn in the race to develop and deploy new AI algorithms. Since ChatGPT launched in November, Microsoft has used the underlying technology to add a chatbot to its Bing search engine, Google has launched a competitor called Bard, and many people are eager to try these new chatbots for help with work or personal tasks.
Meanwhile, a number of well-funded startups, including Anthropic, AI21, Cohere, and Character.AI, are pouring resources into building ever-larger algorithms in an effort to catch up with OpenAI. The initial version of ChatGPT was built on GPT-3, but users can now also access a more powerful version backed by GPT-4.
Altman's statement also hints that GPT-4 may be OpenAI's last major achievement to come from the strategy of making models bigger and feeding them more training data. He did not, however, reveal what research strategies or techniques might replace the current approach. In the paper describing GPT-4, OpenAI says its estimates show diminishing returns from scaling up model size. There are also physical limits to how many data centers the company can build, and how quickly it can build them, Altman said.
Cohere co-founder Nick Frosst, who previously worked on artificial intelligence at Google, said Altman is right that endlessly increasing model size is not a viable plan. In his view, progress on transformers, the type of machine learning model at the heart of GPT-4 and its rivals (editor's note: GPT is short for Generative Pre-trained Transformer, a generative model built on the transformer architecture), is no longer just a matter of scale.
Frosst added: "There are many ways to make transformers better and more useful, and many of them don't involve adding parameters to the model. New AI model designs or architectures, and further tuning based on human feedback, are directions that many researchers are already exploring."
Each version in OpenAI's family of language algorithms is an artificial neural network, software whose design is loosely inspired by the way neurons interact with one another. After training, it can predict the words that should follow a given string of text.
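To make that training objective concrete, here is a deliberately tiny sketch in Python of next-word prediction. It is an illustration only, not how GPT models actually work: they learn billions of parameters by gradient descent, whereas this toy stands in for the idea by counting word pairs in a made-up corpus.

```python
# A minimal sketch of next-word prediction, the task these models are
# trained on. Real models use deep transformer networks; this toy
# version just counts how often one word follows another (a bigram model).
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept".split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the word most often seen after `word` during training."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else "<unknown>"

print(predict_next("the"))  # -> "cat" (seen twice after "the", vs. "mat" once)
```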
In 2019, OpenAI released its language model GPT-2. It has up to 1.5 billion parameters, a measure of the number of adjustable connections between its neurons. That is a very large number, thanks in part to OpenAI researchers' discovery that scaling up made the model more coherent.
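As a rough illustration of what that parameter count measures, the hypothetical sketch below tallies the weights and biases of a small stack of dense layers. The layer sizes are invented for the example and are not GPT-2's actual architecture.

```python
# What a "parameter" is: every weight and bias in the network is one
# adjustable number. The layer sizes below are made up for illustration.
def dense_layer_params(n_in: int, n_out: int) -> int:
    # One weight per input-output connection, plus one bias per output unit.
    return n_in * n_out + n_out

# A toy three-layer network: 1024 -> 4096 -> 4096 -> 1024 units.
layers = [(1024, 4096), (4096, 4096), (4096, 1024)]
total = sum(dense_layer_params(i, o) for i, o in layers)
print(f"{total:,} parameters")  # 25,175,040 -- GPT-2 has up to ~1.5 billion
```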
In 2020, OpenAI launched GPT-3, the successor to GPT-2 and a much larger model with up to 175 billion parameters. GPT-3's broad ability to generate poetry, emails, and other text convinced other companies and research institutions that they could push their own AI models to similar or even larger scales.
After ChatGPT debuted in November last year, meme makers and technology pundits speculated that GPT-4, when it arrived, would be a more complex model with still more parameters. Yet when OpenAI finally announced the new model, it didn't reveal how big it is, perhaps because size is no longer the only thing that matters. At the MIT event, Altman was asked whether training GPT-4 cost $100 million; he replied: "More than that."
Although OpenAI is keeping GPT-4's scale and inner workings secret, it likely no longer relies solely on scaling up to improve performance. One possibility is that the company used a method called reinforcement learning from human feedback to enhance ChatGPT's capabilities: humans judge the quality of the model's answers, and those judgments guide it toward responses that are more likely to be rated highly.
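To give a flavor of how human judgments can steer a model, here is a minimal Python sketch of the pairwise loss commonly used to train a reward model in this kind of setup. It illustrates the general technique, not OpenAI's actual implementation; the numeric scores are stand-ins, since a real reward model computes them from text.

```python
# Reinforcement learning from human feedback (RLHF), core idea: train a
# reward model so that the answer humans preferred scores higher than
# the answer they rejected. Scores here are invented for illustration.
import math

def preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Pairwise (Bradley-Terry style) loss: small when the chosen answer
    outscores the rejected one, large when the ordering is wrong."""
    return -math.log(1.0 / (1.0 + math.exp(-(score_chosen - score_rejected))))

print(preference_loss(2.0, -1.0))  # ~0.049: ordering is right, low loss
print(preference_loss(-1.0, 2.0))  # ~3.049: ordering is wrong, high loss
```

Minimizing a loss like this over many human-labeled answer pairs yields a scoring function, which can then be used to fine-tune the chatbot toward answers people rate as high quality.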
GPT-4's extraordinary capabilities have alarmed many experts and sparked debate over AI's potential to transform the economy, as well as concerns that it could spread disinformation and eliminate jobs. A number of entrepreneurs and AI experts, including Tesla CEO Elon Musk, recently signed an open letter calling for a six-month moratorium on developing models more powerful than GPT-4.
At the MIT event, Altman confirmed that his company is not currently developing GPT-5. He added: "An earlier version of the open letter claimed that OpenAI was training GPT-5. In fact, we are not, and won't be for some time."