The most important difference between ChatGPT and GPT-4 is that GPT-4 predicts the next word with substantially higher accuracy, and the better a neural network can predict the next word in a text, the better it understands that text.
Produced by Big Data Digest
Author: Caleb
What sparks fly when Nvidia meets OpenAI?
Just now, Nvidia founder and CEO Jensen Huang held an in-depth exchange with OpenAI co-founder Ilya Sutskever in a GTC fireside chat.
Video link:
https://www.nvidia.cn/gtc-global/session-catalog/?tab.catalogallsessions=16566177511100015Kus#/session/1669748941314001t6Nv
Two days earlier, OpenAI had launched its most powerful artificial intelligence model to date, GPT-4, which its website bills as "OpenAI's most advanced system," one that "can produce safer and more useful responses."
Sutskever said during the talk that GPT-4 marks "considerable improvements" over ChatGPT in many respects, noting that the new model can read both images and text. "In some future version, [users] may get a chart" in response to their questions, he said.
Unsurprisingly, given the global popularity of ChatGPT and GPT-4, the two models were the focus of the conversation. Beyond GPT-4 and its predecessors, including ChatGPT, Huang and Sutskever also discussed the capabilities, limitations, and inner workings of deep neural networks, as well as predictions for where AI goes next.
Let's take a closer look at the conversation with Big Data Digest~
For many people, Sutskever's name first calls to mind OpenAI and its AI products, but his résumé reaches back much further: a postdoc under Andrew Ng, a research scientist at Google Brain, and a co-developer of the Seq2Seq model.
It is fair to say that deep learning and Sutskever have been bound together from the beginning.
Asked about his understanding of deep learning, Sutskever said that, looking back now, deep learning has indeed changed the world. His personal starting point, though, was more an intuition about AI's enormous potential impact, a strong interest in consciousness and the human experience, and a belief that the development of AI would help answer those questions.
Around 2002-03, the prevailing view was that learning was something only humans could do and that computers could not learn; if computers could be given the ability to learn, it would be a major breakthrough in the field of AI.
That became Sutskever's entry point into the field.
So Sutskever sought out Geoffrey Hinton at his own university, the University of Toronto. In his view, the neural networks Hinton was working on were that breakthrough, because a neural network is in essence a small parallel computer that can be programmed automatically through learning.
At the time, nobody cared about the importance of network scale or compute scale. People trained neural networks with only 50 or 100 neurons; a few hundred neurons counted as large, and a million parameters was considered huge.
On top of that, everything ran as unoptimized CPU code, because nobody yet understood BLAS; researchers ran small experiments in optimized Matlab, tinkering with which questions were good ones to ask.
The problem was that these scattered experiments could not really push the technology forward.
It was then that Sutskever realized supervised learning was the way forward.
This was not just intuition but, in hindsight, an indisputable fact: if a neural network is deep enough and large enough, it can solve genuinely hard tasks. At the time, though, people were not focused on deep, large neural networks, or on neural networks at all.
To find a good solution, a sufficiently large dataset and a great deal of compute were needed.
ImageNet was that dataset. It was, at the time, a ferociously difficult dataset, and training a large convolutional neural network on it demanded matching compute.
That is where the GPU came in. At Geoffrey Hinton's suggestion, they found that with the arrival of ImageNet, the convolutional neural network was a model extremely well suited to the GPU, so it could be made very fast and scaled further and further up.
The result smashed the computer-vision record outright, and not as a continuation of earlier methods; the key was the difficulty and scope of the dataset itself.
Sutskever admitted that in OpenAI's early days, they were not entirely sure how to push the project forward.
At the beginning of 2016, neural networks were far less developed and the research community was much smaller than it is now. Sutskever recalled that there were only about 100 people then, most of whom were still working at Google or DeepMind.
But they had two big ideas at the time.
The first was unsupervised learning through compression. In 2016, unsupervised learning was an unsolved problem in machine learning; nobody knew how to do it. Compression is not something people usually talk about, but recently everyone suddenly realized that GPT in fact compresses its training data.
Mathematically, training these autoregressive generative models compresses the data, and intuitively you can see why that should work: if the data is compressed well enough, every hidden piece of information in it can be extracted. This line of thinking led directly to OpenAI's work on the sentiment neuron.
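To make the compression claim concrete, here is a minimal sketch (our illustration, not OpenAI's code; the token probabilities are made up) of how a model's next-token probabilities translate directly into a compressed size in bits, which is what an arithmetic coder driven by the model would achieve:

```python
import math

# A language model assigns each next token a probability p_i. An arithmetic
# coder can store token i in -log2(p_i) bits, so the total compressed size
# equals the model's summed next-token log-loss: better prediction means
# better compression, and vice versa.
token_probs = [0.25, 0.9, 0.01, 0.6]  # hypothetical p(token_i | tokens_<i)

bits = sum(-math.log2(p) for p in token_probs)
print(f"compressed size: {bits:.2f} bits for {len(token_probs)} tokens")
print(f"average: {bits / len(token_probs):.2f} bits/token "
      f"(= the cross-entropy training loss, in base 2)")
```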
In that work, training an LSTM to predict the next character of Amazon reviews, they found that if you predict the next character well enough, a neuron emerges inside the LSTM that corresponds to the review's sentiment. That was a striking demonstration of unsupervised learning, and it validated the idea of next-character prediction.
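A rough sketch of how such a probe might look (our illustration, not OpenAI's actual setup; the model here is untrained and `hidden_after` is an invented helper):

```python
import torch
import torch.nn as nn

class CharLSTM(nn.Module):
    """Character-level LSTM trained only on next-character prediction."""
    def __init__(self, vocab=256, hidden=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, 64)
        self.lstm = nn.LSTM(64, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)  # next-character logits

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))
        return self.head(h), h

model = CharLSTM()
# ... train with cross-entropy on next-character prediction over reviews ...

def hidden_after(text: str) -> torch.Tensor:
    """Hidden state after the model has read an entire review."""
    ids = torch.tensor([[min(ord(c), 255) for c in text]])
    with torch.no_grad():
        _, h = model(ids)
    return h[0, -1]  # shape: (hidden,)

# Probe: correlate each of the 512 hidden units with sentiment labels.
# In OpenAI's result, a single unit separated positive from negative reviews.
```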
But where does the data for unsupervised learning come from? The hard part of unsupervised learning, Sutskever said, is less about the data and more about why you are doing it at all: realizing that training a neural network to predict the next character is worth pursuing, because the network learns an understandable representation along the way.
The second big idea was reinforcement learning. Sutskever has always believed that bigger is better, and one of OpenAI's goals has been to figure out the right way to scale.
The first really big project OpenAI completed was playing the strategy game Dota 2. OpenAI trained a reinforcement-learning agent to play against itself, with the goal of reaching a level where it could compete with human players.
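The self-play idea itself is simple enough to sketch (a structural sketch only, not OpenAI Five's actual training stack; `agent` and `env` are assumed placeholders you would supply):

```python
import copy

def self_play_training(agent, env, iterations=1000, refresh_every=50):
    """Minimal self-play loop: the agent repeatedly plays a frozen copy of
    itself, so the opponent improves exactly as fast as the agent does."""
    opponent = copy.deepcopy(agent)
    for i in range(iterations):
        trajectory = env.play_match(agent, opponent)  # collect one full game
        agent.update(trajectory)                      # policy-gradient step
        if i % refresh_every == 0:                    # periodically snapshot
            opponent = copy.deepcopy(agent)           # the improved agent
    return agent
```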
The path from Dota's reinforcement learning to reinforcement learning from human feedback, combined with the GPT base models, is what became today's ChatGPT.
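The "human feedback" half is commonly implemented by training a reward model on human preferences between pairs of model outputs; here is a minimal sketch of the standard pairwise loss (our illustration of the general RLHF recipe, not OpenAI's internal code; `reward_model` and the batch encodings are placeholders):

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_model, chosen, rejected):
    """Pairwise loss: push the reward of the human-preferred response above
    the rejected one. reward_model maps a (prompt + response) encoding to a
    scalar score; `chosen` / `rejected` are batched encodings."""
    r_chosen = reward_model(chosen)      # shape: (batch,)
    r_rejected = reward_model(rejected)  # shape: (batch,)
    # -log sigmoid(r_chosen - r_rejected) is minimized when the preferred
    # response scores well above the rejected one.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# The trained reward model then scores samples during RL fine-tuning
# (e.g., with PPO), steering the language model toward preferred behavior.
```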
When you train a large neural network to accurately predict the next word across texts from the Internet, what OpenAI is doing is learning a world model.
On the surface this looks like merely learning statistical correlations in text, but learning those statistical correlations compresses knowledge remarkably well. What the neural network learns in the course of generating text are representations; the text is in fact a map of the world, so the network learns ever more perspectives from which to view humans and society. That is what it really learns from the task of accurately predicting the next word.
And the more accurate the next-word prediction, the higher the fidelity, and the higher the resolution of the picture of the world obtained in the process. That is the role of the pre-training stage, but pre-training alone does not make the network behave the way we want it to.
What a language model really does is answer the question: if I took some random text from the Internet that starts with this prefix or prompt, how would it continue?
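This prefix-completion behavior is easy to observe with any open autoregressive model; a minimal sketch using the Hugging Face transformers library, with GPT-2 standing in for OpenAI's models (which are not downloadable):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "In a shocking finding, scientists discovered"
inputs = tokenizer(prompt, return_tensors="pt")

# Autoregressive generation: repeatedly predict the next token given all
# tokens so far, append it, and continue.
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,                       # sample from the distribution
    top_p=0.9,                            # nucleus sampling
    pad_token_id=tokenizer.eos_token_id,  # gpt2 has no pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```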
Of course, the model could simply complete with whatever random Internet text would plausibly follow, which is not what we actually want, so additional training is required: fine-tuning, reinforcement learning from human teachers, and other forms of AI assistance.
This is not about teaching the model new knowledge; it is about communicating with it, conveying what we want it to be, boundaries included. The better this process is done, the more useful and reliable the network becomes, and the more faithfully the boundaries hold.
Not long after ChatGPT became the fastest-growing application by users, GPT-4 was officially released.
When talking about the differences between the two, Sutskever said that GPT-4 has achieved considerable improvements in many dimensions compared to ChatGPT.
The most important difference is that GPT-4, building on ChatGPT, predicts the next word with substantially higher accuracy: the better a neural network can predict the next word in a text, the better it understands that text.
Take a detective novel with a complex plot, interwoven storylines and characters, and mysterious clues buried everywhere. In the final chapter, the detective has gathered all the clues, calls everyone together, and announces that he will now reveal the culprit: that person is...
That next word is what GPT-4 can predict.
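That prediction quality is directly measurable: a language model assigns an explicit probability to every candidate next word, and a better model concentrates more probability on the right culprit. A sketch, again with GPT-2 as a stand-in and an invented context:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

context = "The detective gathered everyone and said: the culprit is"
ids = tokenizer(context, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits[0, -1]      # scores for the next token only
probs = torch.softmax(logits, dim=-1)

for name in [" John", " Mary", " the"]:    # candidate continuations
    tok = tokenizer(name).input_ids[0]     # first token of each candidate
    print(f"P({name!r}) = {probs[tok].item():.4f}")
```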
People say that deep learning cannot reason logically, yet this example, and some of the other things GPT can do, show a certain degree of reasoning ability.
Sutskever responded that if we define reasoning as the ability to think a problem through in a certain way before making the next decision, and thereby reach a better answer, then how far neural networks can go in that direction remains to be seen; OpenAI has not yet fully tapped their potential.
Some neural networks already have this kind of ability to a degree, but most are not yet reliable enough. Reliability is the biggest obstacle to making these models useful and the major bottleneck of current models; often the question is not whether a model has a specific capability, but how much of it it has.
Sutskever also said that GPT-4 was released without a built-in retrieval function; it is simply a very good tool for predicting the next word. But it fully has the capacity for retrieval, and retrieval will make it better still.
Another significant improvement in GPT-4 is its ability to respond to and process images, where multimodal learning plays the central role. Multimodality, Sutskever said, matters along two dimensions: first, it is useful to a neural network in its own right, vision especially; second, beyond learning from text, the model can also learn knowledge about the world from images.
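GPT-4's image interface cannot be reproduced from the outside, but the underlying idea of learning about the world jointly from images and text can be sketched with an open vision-language model such as CLIP (the image path and captions here are placeholders):

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("chart.png")            # placeholder image path
captions = ["a bar chart of revenue", "a photo of a cat", "a line graph"]

# Encode image and text into a shared embedding space and compare them.
inputs = processor(text=captions, images=image,
                   return_tensors="pt", padding=True)
logits = model(**inputs).logits_per_image  # image-text similarity scores
print(logits.softmax(dim=-1))              # which caption fits the image?
```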
On using AI to train AI, Sutskever said this source of data should not be ignored.
It is hard to predict how language models will develop, but in Sutskever's view there is good reason to believe the field will keep progressing, and AI will keep stunning humanity with what it can do at the boundary of its capabilities. AI's reliability comes down to whether it can be trusted, and it will eventually reach a point where it can be trusted completely.
If a model does not fully understand something, it will ask questions to figure it out, or tell you it does not know. These are the areas where progress will have the greatest impact on AI's usability.
Consider a challenge we face today: you want a neural network to summarize a long document, so how do you ensure no important detail has been overlooked? If a point is important enough that every reader would agree it matters, then a summary that includes it can be accepted as reliable.
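One naive way to approach that coverage check programmatically (a crude sketch, not a solved method; the model choice, file path, and key-point list are all assumptions):

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

document = open("long_report.txt").read()          # placeholder document
# Truncate to stay within the model's input limit (an assumption here).
summary = summarizer(document[:3000],
                     max_length=130, min_length=40)[0]["summary_text"]

# Check that human-flagged key details survived summarization.
key_points = ["Q3 revenue", "layoffs", "new CEO"]  # assumed reviewer input
missing = [k for k in key_points if k.lower() not in summary.lower()]
print(summary)
print("possibly dropped details:", missing or "none")
```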
The same applies to whether the neural network clearly follows the user's intent.
Over the next two years we will see more and more of this, making the technology more and more reliable.
Related reports: https://blogs.nvidia.com/blog/2023/03/22/sutskever-openai-gtc/