


Galaxy AI Network, the answer to transportation capacity in the era of large models
As the value of large AI pre-trained models continues to emerge, the scale of the models is becoming larger and larger. Industry and academia have reached a consensus: In the AI era, computing power is productivity.
Although this understanding is correct, it is not comprehensive. Digital systems have three pillars: storage, computing, and networking, and the same goes for AI technology. If you put aside storage and network computing power, then large models can only stand alone. In particular, network infrastructure adapted to large models has not received effective attention.
Faced with large-scale AI models that frequently require "tens of thousands of cards for training", "tens of thousands of miles of deployment" and "trillions of parameters", network transport capacity is a link that cannot be ignored in the entire intelligent system. The challenges it faces are very prominent, and it is waiting for answers that can break the situation.
Wang Lei, President of Huawei Data Communication Product Line
On September 20, a data communications summit with the theme of "Galaxy AI Network, Accelerating Industry Intelligence" was held during the Huawei Connect Conference 2023. Representatives from all walks of life discussed the transformation and development trends of AI network technology. At the meeting, Wang Lei, President of Huawei’s Data Communications Product Line, officially launched the Galaxy AI network solution. He said that large models make AI smarter, but the cost of training a large model is very high, and the cost of AI talent must also be considered. Therefore, in the intelligentization stage of the industry, only by concentrating on building large computing power clusters and providing intelligent computing cloud services to the society can artificial intelligence be truly penetrated into thousands of industries. Huawei has released a new generation of Galaxy AI network solution. Facing the intelligent era, it builds a new network infrastructure with ultra-high throughput, long-term stability, reliability, elasticity and high concurrency to help AI benefit everyone and accelerate the intelligence of the industry.
Take this opportunity to learn about the network challenges brought by the rise of large models to intelligent computing data centers, and why Huawei Galaxy AI Network is the optimal solution to these problems.
When it comes to the AI era, a model, a piece of data, and a computing unit can be regarded as a starlight. However, only by connecting them together efficiently and stably can a brilliant intelligent world be formed
The explosion of large models triggered a hidden network torrent
We know that the AI model is divided into two stages: training and inference deployment. With the rise of pre-trained large models, huge AI network challenges have also occurred in these two stages.
The first is the training phase of the large model. As the model scale and data parameters become larger and larger, large model training begins to require computing clusters of kilocalorie or even 10,000 kilowatts to complete. This also means that large model training must occur in data centers with AI computing power.
At the current stage, the cost of intelligent computing data centers is very high. According to industry data, the cost of building a cluster with 100P computing power reaches 400 million yuan. Taking a well-known international large model as an example, its daily computing power expenditure during the training process reaches 700,000 US dollars
If the connection capability of the data center network is not smooth, resulting in a large amount of computing resources being lost during network transmission, the losses to the data center and AI models will be immeasurable. On the contrary, if cluster training is more efficient under the same computing power scale, then data centers will gain huge business opportunities. The load rate and other network factors directly determine the training efficiency of the AI model. On the other hand, as the scale of the AI computing power cluster continues to expand, its complexity also increases accordingly, so the probability of failure is also increasing. Building a long-term stable and reliable cluster network is an important pivot for data centers to improve their input-output ratio
Outside of the data center, the value of the AI network can also be seen in the reasoning and deployment scenarios of AI models. The inference deployment of large models mainly relies on cloud services, and cloud service providers must try to serve larger customers with limited computing resources to maximize the commercial value of large models. As a result, the more users there are, the more complex the entire cloud network structure will be. How to provide long-term and stable network services has become a new challenge for cloud computing service providers.
In addition, in the last mile of AI inference deployment, government and enterprise users are faced with the need to improve network quality. In real scenarios, 1% link packet loss will cause TCP performance to drop 50 times, which means that for a 100Mbps broadband, the actual capacity is less than 2Mbps. Therefore, only by improving the network capabilities of the application scenario itself can we ensure the smooth flow of AI computing power and realize truly inclusive AI.
It is not difficult to see from this that in the entire process of the birth, transmission, and application of large AI models, every link faces the challenges and needs of network upgrades. The transportation capacity problem in the era of large models needs to be solved urgently.
The network breakthrough ideas in the intelligent era can extend from starlight to galaxy
The rise of large models has brought about a multi-link, full-process network problem. Therefore, we must take a systematic approach to address this challenge
Huawei has proposed a new network infrastructure for intelligent computing cloud services. The facility needs to support the three capabilities of "high-efficiency training", "non-stop computing power" and "inclusive AI services". These three capabilities cover the entire scenario of AI large models from training to inference deployment. Huawei not only focuses on meeting a single need and upgrading a single technology, but also comprehensively promotes the iteration of AI networks, bringing unique breakthrough ideas to the industry
Specifically, the network infrastructure in the AI era needs to include the following capabilities:
First of all, the network needs to maximize the value of the AI computing cluster in the training scenario. By building a network with ultra-large-scale connection capabilities, we can achieve high efficiency in training large AI models.
Secondly, in order to ensure the stability and sustainability of AI tasks, it is necessary to build long-term and reliable network capabilities to ensure that monthly training is not interrupted, and at the same time, stable delimitation, positioning and recovery at the second level are required. Minimize training interruptions as much as possible. This is the non-stop capacity building of computing power.
Thirdly, during the AI inference deployment process, the network is required to have the characteristics of elasticity and high concurrency, which can intelligently orchestrate massive user flows and provide the best AI landing experience. At the same time, it can resist the impact of network degradation and ensure different AI computing power flows smoothly between regions, which also realizes the capacity building of "inclusive AI services".
Huawei finally launched the Galaxy AI network solution, adhering to this game-breaking idea. This solution integrates dispersed AI technologies and forms a galaxy-like network through powerful computing capabilities
Galaxy AI Network gives a capacity answer to the big model era
During the Huawei Full Connection Conference 2023, Huawei shared its development vision for accelerating the creation of large AI models with large computing power, large storage capacity, and large transportation capacity. The new generation of Huawei's Galaxy AI network solution can be said to be Huawei's solution to large-scale transport capacity in the era of intelligence.
For intelligent data centers, Huawei Galaxy AI Network is the optimal solution based on network power.
Its ultra-high throughput network characteristics can provide important value to the AI cluster in the intelligent computing center to improve the network load rate and enhance training efficiency. Specifically, Galaxy AI network intelligent computing switches have the industry's highest density 400GE and 800GE port capabilities. Only a layer 2 switching network can realize a convergence-free cluster network of 18,000 cards, thus supporting large model training with over one trillion parameters. . Once the networking levels are reduced, it means that the data center can save a lot of optical module costs, while improving the predictability of network risks and obtaining more stable large model training capabilities.
The Galaxy AI network can support network-level load balancing NSLB, increasing the load rate from 50% to 98%, which is equivalent to realizing overclocking operation of the AI cluster, thereby improving the training efficiency by 20%, and meeting the expectations of efficient training
For cloud service manufacturers, Galaxy AI Network can provide stable and reliable computing power guarantee.
In DCI computing room interconnection scenarios, this technology can provide functions such as multi-path intelligent scheduling, automatically identify and proactively adapt to the impact of peak business traffic. It can identify large and small flows from millions of data flows and reasonably allocate them to 100,000 paths to achieve zero congestion in the network and provide elastic guarantee for high-concurrency intelligent computing cloud services
For government and enterprise users, Galaxy AI Network can cope with network degradation problems and ensure universal AI computing power.
It can support elastic anti-degradation capabilities in DCA calculation scenarios. It uses Fillp technology to optimize the TCP protocol, which can increase the bandwidth load rate from 10% to 60% under the condition of 1% packet loss rate, thereby ensuring that data from metropolitan areas The smooth flow of computing power to remote areas accelerates the inclusive application of AI services.
In this way, the network requirements of all aspects of large models from training to deployment are solved. From intelligent computing centers to thousands of industries, they all have the development fulcrum of network-based computing.
In an era of intelligence, a new era of science and technology opened by large models has just begun. Galaxy AI Network provides answers to transportation capacity in the intelligent era
The above is the detailed content of Galaxy AI Network, the answer to transportation capacity in the era of large models. For more information, please follow other related articles on the PHP Chinese website!

For beginners and those interested in business automation, writing VBA scripts, an extension to Microsoft Office, may find it difficult. However, ChatGPT makes it easy to streamline and automate business processes. This article explains in an easy-to-understand manner how to develop VBA scripts using ChatGPT. We will introduce in detail specific examples, from the basics of VBA to script implementation using ChatGPT integration, testing and debugging, and benefits and points to note. With the aim of improving programming skills and improving business efficiency,

ChatGPT plugin cannot be used? This guide will help you solve your problem! Have you ever encountered a situation where the ChatGPT plugin is unavailable or suddenly fails? The ChatGPT plugin is a powerful tool to enhance the user experience, but sometimes it can fail. This article will analyze in detail the reasons why the ChatGPT plug-in cannot work properly and provide corresponding solutions. From user setup checks to server troubleshooting, we cover a variety of troubleshooting solutions to help you efficiently use plug-ins to complete daily tasks. OpenAI Deep Research, the latest AI agent released by OpenAI. For details, please click ⬇️ [ChatGPT] OpenAI Deep Research Detailed explanation:

When writing a sentence using ChatGPT, there are times when you want to specify the number of characters. However, it is difficult to accurately predict the length of sentences generated by AI, and it is not easy to match the specified number of characters. In this article, we will explain how to create a sentence with the number of characters in ChatGPT. We will introduce effective prompt writing, techniques for getting answers that suit your purpose, and teach you tips for dealing with character limits. In addition, we will explain why ChatGPT is not good at specifying the number of characters and how it works, as well as points to be careful about and countermeasures. This article

For every Python programmer, whether in the domain of data science and machine learning or software development, Python slicing operations are one of the most efficient, versatile, and powerful operations. Python slicing syntax a

The evolution of AI technology has accelerated business efficiency. What's particularly attracting attention is the creation of estimates using AI. OpenAI's AI assistant, ChatGPT, contributes to improving the estimate creation process and improving accuracy. This article explains how to create a quote using ChatGPT. We will introduce efficiency improvements through collaboration with Excel VBA, specific examples of application to system development projects, benefits of AI implementation, and future prospects. Learn how to improve operational efficiency and productivity with ChatGPT. Op

OpenAI's latest subscription plan, ChatGPT Pro, provides advanced AI problem resolution! In December 2024, OpenAI announced its top-of-the-line plan, the ChatGPT Pro, which costs $200 a month. In this article, we will explain its features, particularly the performance of the "o1 pro mode" and new initiatives from OpenAI. This is a must-read for researchers, engineers, and professionals aiming to utilize advanced AI. ChatGPT Pro: Unleash advanced AI power ChatGPT Pro is the latest and most advanced product from OpenAI.

It is well known that the importance of motivation for applying when looking for a job is well known, but I'm sure there are many job seekers who struggle to create it. In this article, we will introduce effective ways to create a motivation statement using the latest AI technology, ChatGPT. We will carefully explain the specific steps to complete your motivation, including the importance of self-analysis and corporate research, points to note when using AI, and how to match your experience and skills with company needs. Through this article, learn the skills to create compelling motivation and aim for successful job hunting! OpenAI's latest AI agent, "Open

ChatGPT: Amazing Natural Language Processing AI and how to use it ChatGPT is an innovative natural language processing AI model developed by OpenAI. It is attracting attention around the world as an advanced tool that enables natural dialogue with humans and can be used in a variety of fields. Its excellent language comprehension, vast knowledge, learning ability and flexible operability have the potential to transform our lives and businesses. In this article, we will explain the main features of ChatGPT and specific examples of use, and explore the possibilities for the future that AI will unlock. Unraveling the possibilities and appeal of ChatGPT, and enjoying life and business


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

SublimeText3 English version
Recommended: Win version, supports code prompts!

DVWA
Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

Dreamweaver Mac version
Visual web development tools

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools
