search
HomeTechnology peripheralsAINew title: NVIDIA H200 released: HBM capacity increased by 76%, the most powerful AI chip that significantly improves large model performance by 90%

New title: NVIDIA H200 released: HBM capacity increased by 76%, the most powerful AI chip that significantly improves large model performance by 90%

According to news on November 14, Nvidia officially released the new H200 GPU at the "Supercomputing 23" conference on the morning of the 13th local time, and updated the GH200 product line

Among them, H200 is still built on the existing Hopper H100 architecture, but adds more high-bandwidth memory (HBM3e) to better handle the large data sets required to develop and implement artificial intelligence, making it easier to run The overall performance of the large model is improved by 60% to 90% compared to the previous generation H100. The updated GH200 will also power the next generation of AI supercomputers. More than 200 exaflops of AI computing power will come online in 2024.

H200: HBM capacity increased by 76%, performance of large models improved by 90%

Specifically, the new H200 offers up to 141GB of HBM3e memory, effectively running at approximately 6.25 Gbps, for a total bandwidth of 4.8 TB/s per GPU in the six HBM3e stacks. This is a huge improvement compared to the previous generation H100 (with 80GB HBM3 and 3.35 TB/s bandwidth), with more than 76% increase in HBM capacity. According to official data, when running large models, H200 will bring an improvement of 60% (GPT3 175B) to 90% (Llama 2 70B) compared to H100

New title: NVIDIA H200 released: HBM capacity increased by 76%, the most powerful AI chip that significantly improves large model performance by 90%

New title: NVIDIA H200 released: HBM capacity increased by 76%, the most powerful AI chip that significantly improves large model performance by 90%

While some configurations of the H100 do offer more memory, such as the H100 NVL which pairs the two boards and offers a total of 188GB of memory (94GB per GPU), even compared to the H100 SXM variant, the new The H200 SXM also offers 76% more memory capacity and 43% more bandwidth.

New title: NVIDIA H200 released: HBM capacity increased by 76%, the most powerful AI chip that significantly improves large model performance by 90%

It should be pointed out that the raw computing performance of H200 does not seem to have changed much. The only slide Nvidia showed that reflected compute performance was based on an HGX 200 configuration using eight GPUs, with a total performance of "32 PFLOPS FP8." While the original H100 provided 3,958 teraflops of FP8 computing power, eight such GPUs also provide approximately 32 PFLOPS of FP8 computing power

The improvement brought by higher bandwidth memory depends on the workload. Large models (such as GPT-3) will greatly benefit from the increase in HBM memory capacity. According to Nvidia, the H200 will perform 18 times better than the original A100 when running GPT-3 and approximately 11 times faster than the H100. Additionally, a teaser for the upcoming Blackwell B100 shows that it contains a taller bar that fades to black, roughly twice as long as the H200's.

New title: NVIDIA H200 released: HBM capacity increased by 76%, the most powerful AI chip that significantly improves large model performance by 90%

Not only that, H200 and H100 are compatible with each other. In other words, AI companies that use the H100 training/inference model can seamlessly switch to the latest H200 chip. Cloud service providers do not need to make any modifications when adding H200 to their product portfolio.

Nvidia said that by launching new products, they hope to keep up with the growth in the size of data sets used to create artificial intelligence models and services. The enhanced memory capabilities will make the H200 faster in the process of feeding data to software, a process that helps train artificial intelligence to perform tasks such as recognizing images and speech.

"Integrating faster and larger HBM memory can help improve performance for computationally demanding tasks, including generative AI models and high-performance computing applications, while optimizing GPU usage and efficiency." NVIDIA High said Ian Buck, vice president of performance computing products.

Dion Harris, head of data center products at Nvidia, said: “When you look at the trends in the market, model sizes are increasing rapidly. This is our continued introduction of the latest and greatest technology. Model.”

Mainframe computer manufacturers and cloud service providers are expected to begin using the H200 in the second quarter of 2024. NVIDIA server manufacturing partners (including Evergreen, ASUS, Dell, Eviden, Gigabyte, HPE, Hongbai, Lenovo, Wenda, MetaVision, Wistron and Wiwing) can use the H200 to update existing systems, while Amazon , Google, Microsoft, Oracle, etc. will become the first cloud service providers to adopt H200.

Given the current strong market demand for Nvidia AI chips and the new H200 adding more expensive HBM3e memory, the price of H200 will definitely be more expensive. Nvidia doesn't list a price, but the previous generation H100 cost $25,000 to $40,000.

Nvidia spokesperson Kristin Uchiyama said final pricing will be determined by Nvidia’s manufacturing partners

Regarding whether the launch of H200 will affect the production of H100, Kristin Uchiyama said: "We expect the total supply throughout the year to increase."

Nvidia’s high-end AI chips have always been regarded as the best choice for processing large amounts of data and training large language models and AI generation tools. However, when the H200 chip was launched, AI companies were still desperately seeking A100/H100 chips in the market. The market's focus remains on whether Nvidia can provide enough supply to meet market demand. Therefore, NVIDIA has not given an answer to whether H200 chips will be in short supply like H100 chips

However, next year may be a more favorable period for GPU buyers. According to a report in the Financial Times in August, NVIDIA plans to triple H100 production in 2024, and the production target will increase from approximately 2023 to approximately 500,000 to 2 million in 2024. But generative AI is still booming, and demand is likely to be greater in the future.

For example, the newly launched GPT-4 is trained on approximately 10,000-25,000 blocks of A100. Meta’s large AI model requires approximately 21,000 A100 blocks for training. Stability AI uses approximately 5,000 A100s. Falcon-40B training requires 384 A100

According to Musk, GPT-5 may require 30,000-50,000 H100. Morgan Stanley's quote is 25,000 GPUs.

Sam Altman denied training GPT-5, but mentioned that "OpenAI has a serious shortage of GPUs, and the fewer people using our products, the better."

Of course, in addition to NVIDIA, AMD and Intel are also actively entering the AI ​​market to compete with NVIDIA. The MI300X previously launched by AMD is equipped with 192GB HBM3 and 5.2TB/s memory bandwidth, which will make it far exceed the H200 in terms of capacity and bandwidth.

Similarly, Intel plans to increase the HBM capacity of Gaudi AI chips. According to the latest released information, Gaudi 3 uses a 5nm process, and its performance in BF16 workloads will be four times that of Gaudi 2, and its network performance will also be twice that of Gaudi 2 (Gaudi 2 has 24 built-in 100 GbE RoCE NICs ). Additionally, Gaudi 3 has 1.5 times the HBM capacity of Gaudi 2 (Gaudi 2 has a 96 GB HBM2E). As can be seen from the figure below, Gaudi 3 uses a chiplet-based design with two computing clusters, unlike Gaudi 2 which uses Intel’s single-chip solution

New title: NVIDIA H200 released: HBM capacity increased by 76%, the most powerful AI chip that significantly improves large model performance by 90%

New GH200 super chip: Powering the next generation of AI supercomputers

In addition to releasing the new H200 GPU, Nvidia also launched an upgraded version of the GH200 super chip. This chip uses NVIDIA NVLink-C2C chip interconnect technology, combining the latest H200 GPU and Grace CPU (not sure if it is an upgraded version). Each GH200 super chip will also carry a total of 624GB of memory

New title: NVIDIA H200 released: HBM capacity increased by 76%, the most powerful AI chip that significantly improves large model performance by 90%

For comparison, the previous generation GH200 is based on the H100 GPU and the 72-core Grace CPU, providing 96GB of HBM3 and 512GB of LPDDR5X integrated in the same package.

Although NVIDIA did not introduce the details of the Grace CPU in the GH200 super chip, NVIDIA provided some comparisons between GH200 and "modern dual-channel x86 CPUs." It can be seen that GH200 has brought an 8-fold improvement in ICON performance, and MILC, Quantum Fourier Transform, RAG LLM Inference, etc. have brought dozens or even hundreds of times improvement.

New title: NVIDIA H200 released: HBM capacity increased by 76%, the most powerful AI chip that significantly improves large model performance by 90%

But it should be pointed out that accelerated and "non-accelerated systems" are mentioned. what does that mean? We can only assume that x86 servers are running code that is not fully optimized, especially considering that the world of artificial intelligence is evolving rapidly and new advances in optimization seem to be coming out on a regular basis.

The new GH200 will also be used in the new HGX H200 system. These are said to be "seamlessly compatible" with existing HGX H100 systems, meaning HGX H200s can be used in the same installation to increase performance and memory capacity without the need to redesign the infrastructure.

According to reports, the Alpine supercomputer of the Swiss National Supercomputing Center may be one of the first batch of GH100-based Grace Hopper supercomputers put into use next year. The first GH200 system to enter service in the United States will be the Venado supercomputer at Los Alamos National Laboratory. Texas Advanced Computing Center (TACC) Vista systems will also use the just-announced Grace CPU and Grace Hopper superchips, but it's unclear whether they will be based on the H100 or H200

New title: NVIDIA H200 released: HBM capacity increased by 76%, the most powerful AI chip that significantly improves large model performance by 90%

Currently, the largest supercomputer to be installed is the Jupiter supercomputer at the Jϋlich Supercomputing Center. It will house "nearly" 24,000 GH200 superchips, totaling 93 exaflops of AI computing (presumably using FP8, although most AI still uses BF16 or FP16). It will also deliver 1 exaflop of traditional FP64 computing. It will use a "Quad GH200" board with four GH200 superchips.

Nvidia expects these new supercomputers to be installed in the next year or so to, collectively, achieve more than 200 exaflops of artificial intelligence computing power

If the original meaning does not need to be changed, the content needs to be rewritten into Chinese and the original sentence does not need to appear

The above is the detailed content of New title: NVIDIA H200 released: HBM capacity increased by 76%, the most powerful AI chip that significantly improves large model performance by 90%. For more information, please follow other related articles on the PHP Chinese website!

Statement
This article is reproduced at:搜狐. If there is any infringement, please contact admin@php.cn delete
Tesla's Robovan Was The Hidden Gem In 2024's Robotaxi TeaserTesla's Robovan Was The Hidden Gem In 2024's Robotaxi TeaserApr 22, 2025 am 11:48 AM

Since 2008, I've championed the shared-ride van—initially dubbed the "robotjitney," later the "vansit"—as the future of urban transportation. I foresee these vehicles as the 21st century's next-generation transit solution, surpas

Sam's Club Bets On AI To Eliminate Receipt Checks And Enhance RetailSam's Club Bets On AI To Eliminate Receipt Checks And Enhance RetailApr 22, 2025 am 11:29 AM

Revolutionizing the Checkout Experience Sam's Club's innovative "Just Go" system builds on its existing AI-powered "Scan & Go" technology, allowing members to scan purchases via the Sam's Club app during their shopping trip.

Nvidia's AI Omniverse Expands At GTC 2025Nvidia's AI Omniverse Expands At GTC 2025Apr 22, 2025 am 11:28 AM

Nvidia's Enhanced Predictability and New Product Lineup at GTC 2025 Nvidia, a key player in AI infrastructure, is focusing on increased predictability for its clients. This involves consistent product delivery, meeting performance expectations, and

Exploring the Capabilities of Google's Gemma 2 ModelsExploring the Capabilities of Google's Gemma 2 ModelsApr 22, 2025 am 11:26 AM

Google's Gemma 2: A Powerful, Efficient Language Model Google's Gemma family of language models, celebrated for efficiency and performance, has expanded with the arrival of Gemma 2. This latest release comprises two models: a 27-billion parameter ver

The Next Wave of GenAI: Perspectives with Dr. Kirk Borne - Analytics VidhyaThe Next Wave of GenAI: Perspectives with Dr. Kirk Borne - Analytics VidhyaApr 22, 2025 am 11:21 AM

This Leading with Data episode features Dr. Kirk Borne, a leading data scientist, astrophysicist, and TEDx speaker. A renowned expert in big data, AI, and machine learning, Dr. Borne offers invaluable insights into the current state and future traje

AI For Runners And Athletes: We're Making Excellent ProgressAI For Runners And Athletes: We're Making Excellent ProgressApr 22, 2025 am 11:12 AM

There were some very insightful perspectives in this speech—background information about engineering that showed us why artificial intelligence is so good at supporting people’s physical exercise. I will outline a core idea from each contributor’s perspective to demonstrate three design aspects that are an important part of our exploration of the application of artificial intelligence in sports. Edge devices and raw personal data This idea about artificial intelligence actually contains two components—one related to where we place large language models and the other is related to the differences between our human language and the language that our vital signs “express” when measured in real time. Alexander Amini knows a lot about running and tennis, but he still

Jamie Engstrom On Technology, Talent And Transformation At CaterpillarJamie Engstrom On Technology, Talent And Transformation At CaterpillarApr 22, 2025 am 11:10 AM

Caterpillar's Chief Information Officer and Senior Vice President of IT, Jamie Engstrom, leads a global team of over 2,200 IT professionals across 28 countries. With 26 years at Caterpillar, including four and a half years in her current role, Engst

New Google Photos Update Makes Any Photo Pop With Ultra HDR QualityNew Google Photos Update Makes Any Photo Pop With Ultra HDR QualityApr 22, 2025 am 11:09 AM

Google Photos' New Ultra HDR Tool: A Quick Guide Enhance your photos with Google Photos' new Ultra HDR tool, transforming standard images into vibrant, high-dynamic-range masterpieces. Ideal for social media, this tool boosts the impact of any photo,

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

PhpStorm Mac version

PhpStorm Mac version

The latest (2018.2.1) professional PHP integrated development tool

ZendStudio 13.5.1 Mac

ZendStudio 13.5.1 Mac

Powerful PHP integrated development environment

WebStorm Mac version

WebStorm Mac version

Useful JavaScript development tools

Safe Exam Browser

Safe Exam Browser

Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor