
How to maximize GPU performance

WBOY
2023-08-31 17:09:09

The default way to speed up an artificial intelligence project is to increase the size of the GPU cluster. However, as GPU supply becomes increasingly tight, costs keep climbing. It is understandable that many AI companies spend more than 80% of the capital they raise on computing resources: GPUs are the core of AI infrastructure and deserve as large a share of the budget as possible. Given these high costs, however, other ways of improving GPU performance deserve consideration before rushing to expand the cluster.


This is not an easy task, especially as the explosive growth of generative AI drives a GPU shortage. NVIDIA A100 GPUs were among the first affected and are now extremely scarce, with some versions carrying lead times of up to a year. These supply chain challenges have pushed many to consider the higher-end H100 as an alternative, but obviously at a higher price. For entrepreneurs investing in their own infrastructure to build the next great generative AI solution for their industry, squeezing every drop of efficiency out of existing GPUs is essential.

Let's take a look at how enterprises can get more out of their compute investment by rethinking the network and storage design of their AI infrastructure.

Data Matters

Optimizing the utilization of existing computing infrastructure is an important approach. To maximize GPU utilization, slow data delivery must be addressed so that the GPU stays busy under high load. Some users see GPU utilization of only 20%, which is unacceptable. As a result, AI teams are looking for the best ways to maximize the return on their AI investments.

GPUs are the engine of AI. Just as a car engine needs gasoline to run, a GPU needs data to perform computation. Limit the data flow and you limit GPU performance. If a GPU runs at only 50% efficiency, the AI team's productivity drops, projects take twice as long to complete, and the return on investment is effectively halved. Infrastructure design must therefore ensure that GPUs can operate at maximum efficiency and deliver the expected computing performance.
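One practical way to check whether a GPU is data-starved is to time the input pipeline against the compute inside a training loop. Below is a minimal sketch, assuming PyTorch with a CUDA device; the model, dataset, and batch size are placeholders for illustration only.

```python
import time
import torch
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda")
# Synthetic stand-in data; in practice this would stream from external storage.
dataset = TensorDataset(torch.randn(10_000, 1024),
                        torch.randint(0, 10, (10_000,)))
loader = DataLoader(dataset, batch_size=256, num_workers=4, pin_memory=True)
model = torch.nn.Linear(1024, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()

data_wait = compute = 0.0
t0 = time.perf_counter()
for x, y in loader:
    t1 = time.perf_counter()
    data_wait += t1 - t0                  # time blocked waiting on the input pipeline
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()
    torch.cuda.synchronize()              # flush queued GPU work so timing is honest
    t0 = time.perf_counter()
    compute += t0 - t1

print(f"data wait {data_wait:.2f}s vs. compute {compute:.2f}s")
# If the data-wait share dominates, the GPU is starved: the fix is faster
# storage or networking, not more GPUs.
```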

It is important to note that both DGX A100 and DGX H100 servers offer up to 30 TB of internal storage. However, with the average model's data set at approximately 150 TB, that capacity is insufficient for most deep learning workloads. Additional external data storage is therefore required to feed the GPUs.
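A quick back-of-envelope calculation using the figures above makes the gap concrete:

```python
# Figures from the text: ~30 TB of internal NVMe per DGX server,
# ~150 TB for an average model's data set.
internal_tb = 30
dataset_tb = 150

shortfall_tb = dataset_tb - internal_tb
print(f"External storage needed: at least {shortfall_tb} TB "
      f"({dataset_tb / internal_tb:.0f}x the internal capacity)")
```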

Storage Performance

AI storage is typically composed of servers, NVMe SSDs, and storage software, usually packaged as a simple appliance. Just as GPUs are optimized to process large amounts of data in parallel across tens of thousands of cores, storage also needs to be high performance. In artificial intelligence, the baseline requirement is that storage can hold the entire data set and deliver it to the GPUs at line speed (i.e., the fastest speed the network allows) to keep them saturated and running efficiently. Anything less wastes these very expensive and valuable GPU resources. Delivering data fast enough to keep a cluster of 10 or 15 GPU servers running at full speed optimizes GPU resources, improves performance across the environment, and stretches the budget to get the most from the entire infrastructure.
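As a rough sizing sketch (not vendor guidance), the sustained read throughput the storage tier must deliver scales directly with the number of GPU servers and their line rate. The 200 Gb/s per-server ingest rate below is an assumption for illustration:

```python
GBPS_PER_SERVER = 200        # assumed network line rate per GPU server, gigabits/s
BYTES_PER_GIGABIT = 1e9 / 8

def required_storage_gbytes_per_sec(num_gpu_servers: int) -> float:
    """Aggregate GB/s the storage tier must sustain to feed all servers at line rate."""
    return num_gpu_servers * GBPS_PER_SERVER * BYTES_PER_GIGABIT / 1e9

for n in (1, 10, 15):
    print(f"{n:>2} GPU servers -> "
          f"{required_storage_gbytes_per_sec(n):.0f} GB/s sustained reads")
```

At the assumed 200 Gb/s per server, 15 GPU servers already demand 375 GB/s of sustained reads from storage.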

The challenge is that most storage providers, whose products are not optimized for AI, require many client compute nodes to extract full performance from the storage. If you start with one GPU server, you may in turn need many storage nodes to deliver the performance that single GPU server requires.

Don't trust every benchmark result: it is easy to show high aggregate bandwidth across multiple GPU servers, but AI depends on storage that can deliver its full performance to a single GPU node whenever needed. Favor storage that provides the ultra-high performance you need from a single storage node and can deliver that performance to a single GPU node. This may narrow the field of vendors, but it should be the priority when starting your AI project journey.
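When evaluating vendors, it helps to measure the single-client number yourself. A purpose-built tool such as fio is more rigorous; the minimal sketch below just illustrates the idea. The mount path and block size are hypothetical, and the test file should be larger than RAM so the page cache does not inflate the result.

```python
import time

PATH = "/mnt/ai-storage/sample.bin"   # hypothetical file on the storage mount
BLOCK = 8 * 1024 * 1024               # 8 MiB sequential reads

def single_client_read_gbps(path: str) -> float:
    """Sustained read throughput, in GB/s, from one mount to one node."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:   # unbuffered to avoid double-copying
        while chunk := f.read(BLOCK):
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / elapsed / 1e9

if __name__ == "__main__":
    print(f"single-client throughput: {single_client_read_gbps(PATH):.2f} GB/s")
```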

Network Bandwidth

Increasingly powerful computing is driving demand on the rest of the AI infrastructure. Bandwidth requirements have reached new heights, as every second vast amounts of data must travel over the network from storage to the GPUs. Network adapters (NICs) in the storage system connect to switches, which connect to adapters inside the GPU servers. Correctly configured, NICs can connect storage directly to the NICs in one or two GPU servers without bottlenecks. Keeping bandwidth high enough to push the maximum data load from storage to the GPUs, and sustaining that saturation over time, is key; in many cases, failing to do so is why we see low GPU utilization.
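A simple first check is whether each link has actually negotiated the speed you paid for. The Linux-only sketch below reads negotiated NIC speeds from sysfs; the 200 Gb/s target is an assumed requirement for illustration, not a universal one.

```python
from pathlib import Path

TARGET_MBPS = 200_000  # assumed per-link requirement (200 Gb/s), in Mb/s

for dev in sorted(Path("/sys/class/net").iterdir()):
    try:
        mbps = int((dev / "speed").read_text().strip())
    except (OSError, ValueError):
        continue  # virtual or down interfaces don't report a negotiated speed
    status = "OK" if mbps >= TARGET_MBPS else "below target"
    print(f"{dev.name}: {mbps} Mb/s ({status})")
```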

GPU Orchestration

Once the infrastructure is in place, GPU orchestration and allocation tools greatly help teams assemble and allocate resources more efficiently, understand GPU usage, gain finer control over resources, reduce bottlenecks, and improve utilization. These tools can only accomplish all of this as intended if the underlying infrastructure keeps data flowing correctly.
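At their core, such tools poll per-GPU telemetry and act on it. A minimal sketch of that polling, assuming an NVIDIA driver and the pynvml package are installed, looks like this:

```python
import time
import pynvml

pynvml.nvmlInit()
try:
    handles = [pynvml.nvmlDeviceGetHandleByIndex(i)
               for i in range(pynvml.nvmlDeviceGetCount())]
    for _ in range(5):                    # five one-second samples
        for i, h in enumerate(handles):
            util = pynvml.nvmlDeviceGetUtilizationRates(h)
            # Persistently low compute % flags a starved or idle device
            # that an orchestrator could reclaim or reallocate.
            print(f"GPU {i}: compute {util.gpu}%  memory {util.memory}%")
        time.sleep(1)
finally:
    pynvml.nvmlShutdown()
```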

In the field of artificial intelligence, data is the key input. Traditional enterprise flash, built for mission-critical enterprise applications (e.g., inventory control database servers, email servers, backup servers), is therefore a poor fit for AI. These solutions rest on legacy protocols, and while they have been repurposed for AI, those legacy foundations limit their performance for GPU and AI workloads, drive up prices, and waste money on expensive, unnecessary features.

With the current global shortage of GPUs, coupled with the rapid growth of the artificial intelligence industry, finding ways to maximize GPU performance has never been more important, especially in the short term. As deep learning projects flourish, these methods become key levers for reducing costs and improving output.
