ChatGPT Special Topic: The Capabilities and Future of Large Language Models

1. Commercialization of Generative Models

Generative AI is currently a hot investment track. According to PitchBook statistics, generative AI received approximately US$1.4 billion in financing in 2022, almost matching the total of the previous five years. Star companies such as OpenAI and Stability AI, as well as other start-ups such as Jasper, Regie.AI, and Replika, have all attracted capital.

Chart of the relationship between financing amount and time

In October 2022, Stability AI, developer of the open source model Stable Diffusion, received approximately US$100 million in financing. Stable Diffusion generates pictures from text descriptions entered by the user, and it ignited the field of AI painting. On November 30, 2022, ChatGPT opened its public beta; five days after it went online, the number of global users exceeded one million, and in less than 40 days after launch, daily active users exceeded 10 million. In the early morning of March 15, 2023, OpenAI released GPT-4, the most powerful model in the GPT series. It is a large-scale multimodal model that accepts image and text input and produces text output, and it had a disruptive impact on the industry. On March 17, 2023, Microsoft held the Microsoft 365 Copilot conference, officially integrated OpenAI's GPT-4 model into the Office suite, and launched the new AI function Copilot. It can not only make PPT slides and write copy, but also perform analysis and generate videos.

In addition, major domestic manufacturers have also announced products similar to ChatGPT. On February 8, 2023, Alibaba experts revealed that Damo Academy was developing a ChatGPT-like conversational robot and had opened it to employees within the company for testing; the large-model technology may be deeply integrated with DingTalk productivity tools. On the same day, He Xiaodong, Vice President of JD.com, said that JD.com has rich scenarios and high-quality data in the field of ChatGPT. On February 9, relevant sources at Tencent said that Tencent has plans for products similar to ChatGPT and AI-generated content, and that dedicated research is progressing in an orderly manner. NetEase said that its education business will integrate AI-generated content, including but not limited to AI speaking teachers and essay scoring and evaluation.

On March 16, Baidu officially released the large language model and generative AI product "Wen Xin Yi Yan". Two days after the release, 12 companies had signed the first batch of cooperation agreements, and the number of companies applying to test the Baidu Intelligent Cloud Wen Xin Yi Yan API calling service had reached 90,000.

At present, large models have gradually penetrated into our lives, and in the future all walks of life are likely to undergo earth-shaking changes. Taking ChatGPT as an example, potential applications include the following:

  • ChatGPT Media: Enables intelligent news writing and improves the timeliness of news.
  • ChatGPT Film and Television: Customizes film and television content according to public interests to obtain higher ratings, box office, and word of mouth; reduces the cost of content creation for production teams and improves creative efficiency.
  • ChatGPT Marketing: Acts as a virtual customer service agent to assist product marketing. For example, 24-hour product introductions and online service reduce marketing costs; it can quickly understand customer needs and keep up with technological trends; and it provides stable, reliable consulting services with strong controllability and security.
  • ChatGPT Entertainment: Serves as a real-time chat partner, enhancing companionship and fun.
  • ChatGPT Education: Provides new educational tools that let students quickly find and fill gaps in their knowledge through self-service questions.
  • ChatGPT Finance: Automates the production of financial information and financial products, and creates virtual financial advisors.
  • ChatGPT Medical: Quickly understands the patient's condition, gives timely feedback, and provides immediate emotional support.

It should be noted that although the main discussion here is the implementation of large language models, in fact, other large models in multiple modalities (audio, video, pictures) also have broad application scenarios.

2. Introduction to generative models

1. Mainstream large language model: LaMDA

LaMDA was released by Google. The model is based on the Transformer architecture, has 137 billion parameters, and can model long-distance dependencies in text. It is trained on dialogue data in two stages, pre-training and fine-tuning. In the pre-training stage, the authors used up to 1.56T words of public dialogue data and web page text, with language modeling (LM) as the training objective, that is, the goal is to predict the next token. In the fine-tuning stage, they designed multiple tasks, such as scoring attributes of responses (sensibleness, safety, etc.), to instill human preferences in the language model. The figure below shows one type of fine-tuning task.

LaMDA model pre-training phase
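The language modeling objective used in pre-training (predict the next token) can be illustrated with a small sketch. This is a toy illustration only, not LaMDA's actual training code: the three-token vocabulary, the logits, and the function names `softmax` and `next_token_loss` are all assumptions made for the example.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution over the vocabulary."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_loss(logits_per_position, token_ids):
    """Average cross-entropy of predicting token t+1 from the scores at position t."""
    losses = []
    for t in range(len(token_ids) - 1):
        probs = softmax(logits_per_position[t])
        target = token_ids[t + 1]          # the token that actually comes next
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)

# Hypothetical 3-token vocabulary and a 3-token training sequence.
token_ids = [0, 2, 1]

# A model that has learned nothing assigns equal scores to every token.
uniform_logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
# A better model puts most of its score on the correct next token.
confident_logits = [[0.0, 0.0, 5.0], [0.0, 5.0, 0.0]]

uniform_loss = next_token_loss(uniform_logits, token_ids)
confident_loss = next_token_loss(confident_logits, token_ids)
print(uniform_loss, confident_loss)
```

The loss falls as the model assigns more probability to the token that actually follows, which is the quantity pre-training tries to minimize over the whole corpus.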

One of the tasks in the LaMDA model fine-tuning phase

The LaMDA model focuses on dialogue generation tasks but often makes factual errors. In 2023, Google released Bard, an experimental conversational AI service powered by the LaMDA model. However, Bard made a factual error during its launch event. Google's share price plunged that Wednesday, falling more than 8% intraday to a low of about US$98, and roughly US$110 billion of market value evaporated.

2. Mainstream large language model: InstructGPT

The InstructGPT model is based on the GPT architecture, and its training mainly consists of supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). ChatGPT, a conversational product powered by InstructGPT, focuses on generating natural language text and can also generate code and perform simple mathematical operations. The technical details were discussed in the previous two issues, so we will not repeat them here.

InstructGPT model training flow chart

3. Mainstream large language model: Claude

Claude model training flow chart

Claude is a conversational product from Anthropic. Like ChatGPT, Claude is a GPT-style unidirectional (left-to-right) language model. Unlike ChatGPT, however, it is mainly trained with supervised fine-tuning and reinforcement learning from AI feedback. In the supervised fine-tuning stage, a series of rules (a Constitution) is formulated first, such as not generating harmful information and not generating racially biased content, and supervised data is then obtained based on these rules. Next, an AI judges the quality of responses, which automatically produces the training data set for reinforcement learning.
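The AI-feedback step described above can be pictured with a very rough sketch: a judge compares two candidate responses against a list of rules and records which one it prefers, and the resulting pairs form the data set for reinforcement learning. Everything in the snippet is hypothetical. The rules paraphrase the examples in the text, and `toy_judge` is a keyword check standing in for what would really be a call to a language model; it is not Anthropic's method.

```python
from typing import Callable, Dict

# Hypothetical rules, paraphrasing the examples given in the text.
CONSTITUTION = [
    "Do not generate harmful information.",
    "Do not generate racially biased content.",
]

def toy_judge(rule: str, response: str) -> bool:
    """Placeholder for an AI judge: True if the response appears to respect the rule.

    A real system would ask a language model to evaluate the response against
    the rule. This stand-in ignores the rule text and only checks for a few
    flagged phrases.
    """
    flagged = ("how to build a weapon", "inferior race")
    return not any(phrase in response.lower() for phrase in flagged)

def label_preference(prompt: str, response_a: str, response_b: str,
                     judge: Callable[[str, str], bool] = toy_judge) -> Dict[str, str]:
    """Score both responses against every rule and record the preferred one."""
    score_a = sum(judge(rule, response_a) for rule in CONSTITUTION)
    score_b = sum(judge(rule, response_b) for rule in CONSTITUTION)
    if score_a >= score_b:
        chosen, rejected = response_a, response_b
    else:
        chosen, rejected = response_b, response_a
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

example = label_preference(
    "Tell me something dangerous.",
    "Sure, here is how to build a weapon at home.",
    "I can't help with that, but I can suggest safety resources.",
)
print(example["chosen"])
```

The point of the design is that no human labels the pair: the preference comes from the judge, so the data set can be built automatically.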

Compared with ChatGPT, Claude rejects inappropriate requests more clearly, and the connections between its sentences are more natural. Claude is also willing to say so when faced with a problem that is beyond its capabilities. Currently, Claude is still in the internal testing stage. However, according to internal test results from members of the Scale Spellbook team, Claude is stronger than ChatGPT in 8 of the 12 tasks tested.

3. Capabilities of large language models

We have compiled statistics on large language models in China and abroad, including their capabilities and open source status.

Domestic popular large language models

Foreign popular large language models

As you can see, large language models have a variety of capabilities, including but not limited to few-shot learning and zero-shot transfer. A natural question follows: how do these abilities come about? Where does the power of large language models come from? Next, we try to answer these questions.

The figure below shows some mature large language models and how they evolved. To sum up, most models go through three stages: pre-training, instruction fine-tuning, and alignment. Representative models include DeepMind's Sparrow and OpenAI's ChatGPT.

Evolutionary diagram of popular large language models

So what capabilities does the model gain at each step? Dr. Fu Yao from the University of Edinburgh has summarized what he believes to be the correspondence between stages and abilities, which offers some inspiration.

1. Pre-training stage. The goal of this stage is to obtain a powerful base model. The capabilities the model demonstrates at this stage include language generation, in-context learning, world knowledge, and reasoning. Representative models at this stage include GPT-3 and PaLM.

2. Instruction fine-tuning stage. The goal of this stage is to unlock some emergent abilities. Emergent ability here refers specifically to abilities that large models have but small models do not. A model that has undergone instruction fine-tuning has capabilities that the base model lacks. For example, by constructing new instructions, the model can solve new tasks; another example is chain-of-thought ability, that is, when the model is shown a reasoning process, it can imitate that reasoning to reach the correct answer. Representative models include InstructGPT and Flan.

3. Alignment stage. The goal of this stage is to make the model conform to human values, such as generating informative replies and not producing discriminatory remarks. The alignment stage can be thought of as giving the model a "personality". The representative model of this type is ChatGPT.

Three stages of large language model. The picture comes from "Fu Yao: On the Source of the Ability of Large Language Models"

Generally speaking, the above three stages complement each other and are indispensable. Only when a sufficiently powerful basic model is obtained in the pre-training stage can it be possible to stimulate (or enhance) other capabilities of the language model through instruction fine-tuning. The alignment stage gives the model a certain "character" to better comply with some requirements of human society.
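The ability to imitate a demonstrated reasoning process, described under the instruction fine-tuning stage above (often called chain-of-thought prompting), can be made concrete with a small sketch of how such a prompt might be assembled. The `build_cot_prompt` helper, the "Q:/A:" layout, and the example questions are assumptions chosen for illustration; the resulting string would then be sent to whichever model is being used.

```python
def build_cot_prompt(demonstrations, question):
    """Build a few-shot prompt whose demonstrations spell out their reasoning."""
    parts = []
    for demo in demonstrations:
        parts.append(
            f"Q: {demo['question']}\n"
            f"A: {demo['reasoning']} The answer is {demo['answer']}."
        )
    # The new question ends with an open "A:" so the model continues with its own reasoning.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

# Hypothetical demonstration showing the reasoning steps, not just the answer.
demos = [
    {
        "question": "Tom has 3 apples and buys 2 more. How many apples does he have?",
        "reasoning": "Tom starts with 3 apples. Buying 2 more gives 3 + 2 = 5.",
        "answer": "5",
    }
]

prompt = build_cot_prompt(
    demos,
    "A shelf holds 4 books and 6 more are added. How many books are on the shelf?",
)
print(prompt)
```

Because the demonstration includes the intermediate steps, a sufficiently capable model tends to write out similar steps for the new question before giving its answer.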

4. Generative model identification

While large language model technology brings convenience, it also carries risks and challenges. At the technical level, the truthfulness of content generated by GPT cannot be guaranteed, and it may include harmful remarks. At the usage level, users may abuse AI-generated text in fields such as education and scientific research. Many companies and institutions have begun to restrict the use of ChatGPT. Microsoft and Amazon have banned employees from sharing sensitive data with ChatGPT for fear of leaking confidential information, and the University of Hong Kong has banned the use of ChatGPT and other artificial intelligence tools in all of its classes, assignments, and assessments. Here we mainly introduce related detection work from industry.

GPTZero: GPTZero is the earliest tool for identifying generated text. It is a website (https://gptzero.me/) published by Edward Tian, a CS undergraduate student at Princeton University in the United States. It relies on text perplexity (PPL) as the indicator for determining who wrote the given content. Perplexity is a measure used to evaluate the quality of a language model; in essence, it is computed from the probability the model assigns to a sentence.

GPTZero website interface

(Here we use ChatGPT to generate a news report and let GPTZero determine whether it is generated text.)
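The perplexity indicator that GPTZero relies on can be sketched as follows. This is a minimal illustration of the standard definition, PPL = exp(-(1/N) Σ log p_i), and not GPTZero's actual implementation; the per-token probabilities are made-up values standing in for what a real language model would assign to each token of a text.

```python
import math

def perplexity(token_probs):
    """Perplexity of a text, given the probability a language model gave each token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)

# Hypothetical per-token probabilities from some language model.
predictable_text = [0.5, 0.5, 0.5, 0.5]    # the model found every token fairly likely
surprising_text = [0.1, 0.05, 0.2, 0.1]    # the model found the tokens unlikely

print(perplexity(predictable_text))  # approximately 2.0
print(perplexity(surprising_text))   # approximately 10.0
```

A detector built on this idea would typically treat low perplexity (text the model finds very predictable) as a sign of machine generation and high perplexity as a sign of human writing, with the cut-off chosen empirically.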

GPT2 Output Detector: This tool is published by OpenAI. It fine-tunes RoBERTa on the "GPT2-Generated Content" and Reddit datasets to learn a detection classifier; in other words, it "fights magic with magic." The official website also notes that the prediction results become reliable only when the text exceeds about 50 tokens.

GPT2 Output Detector website interface

AI Text Classifier: This tool is published by OpenAI. It works by collecting human-written and AI-written texts on the same topic, splitting each text into prompt and reply pairs, and fine-tuning GPT so that the probability it assigns to an answer (for example, Yes/No) serves as the score that is compared against thresholds. The classification is fine-grained: results fall into 5 categories, ranging from "very unlikely to be generated by AI" to "likely to be generated by AI" (threshold 0.98).

AI Text Classifier website interface

5. Summary & Outlook

Large language models have emergent capabilities that small models do not have, such as excellent zero-shot learning, domain transfer, and chain-of-thought capabilities. The power of large models comes from pre-training, instruction fine-tuning, and alignment. These three processes are closely related and have made today's extremely powerful large language models possible.

Large language models (the GPT series) currently lack capabilities such as belief updating, formal reasoning, and Internet retrieval. Some experts believe that if knowledge can be offloaded outside the model, the number of parameters can be greatly reduced, and large language models can truly go a step further.

Only under reasonable supervision and governance can artificial intelligence technology better serve people. There is still a long way to go in developing large models in China!

References

[1] https://stablediffusionweb.com

[2] https://openai.com/product/gpt-4

[3] LaMDA: Language Models for Dialog Applications, arXiv 2022.10

[4] Constitutional AI: Harmlessness from AI Feedback, arXiv 2022.12

[5] https://scale.com/blog/chatgpt-vs-claude#Calculation

[6] Guolian Securities: "ChatGPT has arrived, and commercialization is accelerating"

[7] Guotai Junan Securities: "ChatGPT Research Framework 2023"

[8] Fu Yao: Pre-training, instruction fine-tuning, alignment, specialization: On the source of large language model capabilities https://www.bilibili.com/video/BV1Qs4y1h7pn/?spm_id_from=333.880.my_history.page.click&vd_source=da8bf0b993cab65c4de0f26405823475

[9] Analysis of a 10,000-word long article! Reproduce and use GPT-3/ChatGPT, what you should know https://mp.weixin.qq.com/s/ILpbRRNP10Ef1z3lb2CqmA

Statement
This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.