NVIDIA H100 dominates the authoritative AI performance test, completing large model training based on GPT-3 in 11 minutes
On Tuesday local time, MLCommons, the open industry consortium for machine learning and artificial intelligence, released the latest results of two MLPerf benchmarks. In the AI training performance tests, systems built on NVIDIA's H100 chips set new records in every category and were the only hardware platform to run all of the tests.
(Source: NVIDIA, MLCommons)
MLPerf is developed by an alliance of academia, research laboratories and industry, and is currently the internationally recognized, authoritative benchmark for evaluating AI performance. The Training v3.0 suite contains eight workloads spanning vision (image classification, biomedical image segmentation, and two object detection workloads), language (speech recognition, large language model, and natural language processing) and recommendation systems. The benchmark measures how long different vendors' hardware takes to complete each training task.
(MLPerf Training v3.0 benchmark workloads, source: MLCommons)
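To make the "time to train" metric concrete, here is a minimal sketch of the idea in Python: train until a target quality level is reached and report the elapsed wall-clock time. This is an illustration only, not the official MLPerf harness; `train_one_epoch`, `evaluate` and the target value are hypothetical placeholders.

```python
# Minimal sketch of the "time to train" idea behind MLPerf Training:
# train until a target quality metric is reached, report wall-clock time.
# Not the official MLPerf harness; the training and evaluation functions
# are hypothetical placeholders supplied by the caller.
import time

def time_to_train(train_one_epoch, evaluate, target_quality=0.75):
    start = time.perf_counter()
    quality, epochs = 0.0, 0
    while quality < target_quality:   # loops until the quality target is met
        train_one_epoch()
        quality = evaluate()
        epochs += 1
    minutes = (time.perf_counter() - start) / 60
    return epochs, minutes
```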
In the "big language model" training test that investors are more concerned about, the data submitted by NVIDIA and GPU cloud computing platform CoreWeave set a cruel industry standard for this test. With the concerted efforts of 896 Intel Xeon 8462Y processors and 3584 NVIDIA H100 chips, it only took 10.94 minutes to complete the large language model training task based on GPT-3.
Apart from NVIDIA, Intel was the only vendor to submit results for this workload. A system built with 96 Xeon 8380 processors and 96 Habana Gaudi2 AI chips completed the same test in 311.94 minutes. For comparison, an NVIDIA platform with 768 H100 chips finished in 45.6 minutes.
(The more chips, the faster the training, source: NVIDIA)
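A quick back-of-the-envelope check of the two NVIDIA results above shows that the speedup from adding chips is strong but not perfectly linear; a minimal sketch using only the figures reported here:

```python
# Back-of-the-envelope scaling check using the reported GPT-3 training times:
# 3,584 H100s finished in 10.94 minutes, 768 H100s in 45.6 minutes.
chips_large, minutes_large = 3584, 10.94
chips_small, minutes_small = 768, 45.6

speedup = minutes_small / minutes_large      # ~4.17x faster
chip_ratio = chips_large / chips_small       # ~4.67x more chips
efficiency = speedup / chip_ratio            # ~0.89, i.e. ~89% scaling efficiency

print(f"{speedup:.2f}x speedup from {chip_ratio:.2f}x more chips "
      f"-> {efficiency:.0%} scaling efficiency")
```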
Intel said these results still have room to improve: in principle, stacking more chips naturally yields faster training. Jordan Plawner, Intel's senior director of AI products, told the media that Habana's results will improve by 1.5 to 2 times. Plawner declined to disclose pricing for Habana Gaudi2, saying only that the industry needs a second supplier of AI training chips, and that the MLPerf data shows Intel can fill that need.
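Taking that claim at face value, a 1.5x to 2x improvement would put the Gaudi2 result roughly in the 156 to 208 minute range; a quick projection from the reported figure (a hypothetical extrapolation, not a published result):

```python
# Projection only: apply Intel's claimed 1.5x-2x improvement to the
# reported 311.94-minute Gaudi2 GPT-3 training result.
baseline_minutes = 311.94
for factor in (1.5, 2.0):
    print(f"{factor}x faster -> {baseline_minutes / factor:.1f} minutes")
# 1.5x faster -> 208.0 minutes
# 2.0x faster -> 156.0 minutes
```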
In BERT-Large training, a model more familiar to Chinese investors, NVIDIA and CoreWeave pushed the time down to an extreme 0.13 minutes; even with only 64 cards, the result was 0.89 minutes. BERT, like today's mainstream large models, is built on the Transformer architecture.
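For readers unfamiliar with that architecture, a BERT-style Transformer encoder block can be sketched in a few lines of PyTorch. This is a simplified illustration, not the MLPerf reference implementation; the dimensions follow the commonly published BERT-Large configuration (hidden size 1024, 16 attention heads, 4096-dim feed-forward layer).

```python
# Simplified BERT-style Transformer encoder block (illustration only).
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    def __init__(self, hidden=1024, heads=16, ffn=4096, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden, heads, dropout=dropout,
                                          batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(hidden, ffn), nn.GELU(),
                                 nn.Linear(ffn, hidden))
        self.norm1 = nn.LayerNorm(hidden)
        self.norm2 = nn.LayerNorm(hidden)
        self.drop = nn.Dropout(dropout)

    def forward(self, x, mask=None):
        # Self-attention with residual connection and layer norm
        attn_out, _ = self.attn(x, x, x, key_padding_mask=mask)
        x = self.norm1(x + self.drop(attn_out))
        # Position-wise feed-forward with residual connection and layer norm
        return self.norm2(x + self.drop(self.ffn(x)))

# Example: a batch of 2 sequences of length 128
x = torch.randn(2, 128, 1024)
print(EncoderBlock()(x).shape)  # torch.Size([2, 128, 1024])
```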
Source: Financial Associated Press