Home >Technology peripherals >AI >Meta continuously throws the AI ​​to accelerate the ultimate move! The first AI inference chip, AI supercomputer specially designed for large model training

Meta continuously throws the AI ​​to accelerate the ultimate move! The first AI inference chip, AI supercomputer specially designed for large model training

王林
王林forward
2023-05-25 22:04:041698browse

Meta continuously throws the AI ​​to accelerate the ultimate move! The first AI inference chip, AI supercomputer specially designed for large model training

Compiled | Li Shuiqing

Editor | Xin Yuan

Zhidongxi News on May 19th, local time on May 18th, Meta issued an announcement on its official website. In order to cope with the sharp growth in demand for AI computing power in the next ten years, Meta is executing a grand plan - to build a new environment specifically for AI. Generation infrastructure.

Meta announced its latest progress in building next-generation infrastructure for AI, including the first custom chip for running AI models, a new AI-optimized data center design, the first video transcoding ASIC, and the integration of 16,000 GPU, AI supercomputer RSC used to accelerate AI training, etc.

Meta continuously throws the AI ​​to accelerate the ultimate move! The first AI inference chip, AI supercomputer specially designed for large model training

▲Meta official website’s disclosure of AI infrastructure details

Meta views AI as the company’s core infrastructure. Since Meta broke ground on its first data center in 2010, AI has become the engine of more than 3 billion people using the Meta family of applications every day. From the Big Sur hardware in 2015 to the development of PyTorch, to the initial deployment of Meta's AI supercomputer last year, Meta is currently further upgrading and evolving these infrastructures.

1. Meta’s first generation AI inference accelerator, 7nm process, 102.4TOPS computing power

MTIA (Meta Training and Inference Accelerator) is Meta’s first in-house customized accelerator chip series for inference workloads.

AI workloads are ubiquitous in Meta’s business and are the basis for a wide range of application projects, including content understanding, information flow, generative AI, and ad ranking. As AI models increase in size and complexity, underlying hardware systems need to provide exponential increases in memory and computation while maintaining efficiency. However, Meta found that it was difficult for CPUs to meet the efficiency level requirements required by its scale, so it designed Meta's self-developed training and inference accelerator MTIA ASIC series to address this challenge.

Starting in 2020, Meta designed the first generation MTIA ASIC for its internal workloads. The accelerator uses TSMC's 7nm process, runs at 800MHz, and provides 102.4TOPS computing power at INT8 precision and 51.2TFLOPS computing power at FP16 precision. Its thermal design power (TDP) is 25W.

According to reports, MTIA provides higher computing power and efficiency than CPU. By deploying MTIA chips and GPUs at the same time, it will provide better performance, lower latency and higher efficiency for each workload.

2. Lay out the next generation data center and develop the first video transcoding ASIC

Meta’s next-generation data center design will support its current products while supporting training and inference for future generations of AI hardware. This new data center will be optimized for AI, supporting liquid-cooled AI hardware and a high-performance AI network connecting thousands of AI chips for data center-scale AI training clusters.

According to the official website, Meta’s next-generation data center will also be built faster and more cost-effectively, and will be complemented by other new hardware, such as Meta’s first internally developed ASIC solution, MSVP, designed to serve Meta’s growing Powered by video workloads.

With the emergence of new technology content such as generative AI, people's demand for video infrastructure has further intensified, which prompted Meta to launch a scalable video processor MSVP.

MSVP is Meta’s first in-house developed ASIC for video transcoding. MSVP is programmable and scalable, and can be configured to efficiently support the high-quality transcoding required for on-demand, as well as the low latency and faster processing times required for live streaming. In the future, MSVP will also help bring new forms of video content to every member of the Meta family of applications - including AI-generated content as well as VR (virtual reality) and AR (augmented reality) content.

Meta continuously throws the AI ​​to accelerate the ultimate move! The first AI inference chip, AI supercomputer specially designed for large model training

▲MSVP architecture diagram

3. The AI ​​supercomputer integrates 16,000 GPUs and supports LLaMA large models to accelerate training iterations

According to Meta’s announcement, its AI Super Computer (RSC) is one of the fastest artificial intelligence supercomputers in the world and is designed to train the next generation of large-scale AI models for new AR tools, content understanding systems, real-time translation technology, etc. Provide motivation.

Meta RSC features 16,000 GPUs, all accessible through a three-level Clos network structure, providing full bandwidth to each of the 2,000 training systems. Over the past year, the RSC has been promoting research projects like LLaMA.

LLaMA is a large language model built and open sourced by Meta earlier this year, with a scale of 65 billion parameters. Meta says its goal is to provide a smaller, higher-performance model that researchers can study and fine-tune for specific tasks without the need for significant hardware.

Meta trained LLaMA 65B and the smaller LLaMA 33B based on 1.4 trillion Tokens. Its smallest model, LLaMA 7B, also uses one trillion Tokens for training. The ability to run at scale allows Meta to accelerate training and tuning iterations, releasing models faster than other businesses.

Conclusion: The application of large model technology forces major manufacturers to accelerate the layout of infrastructure

Meta custom-designs much of its infrastructure primarily because this allows it to optimize the end-to-end experience, from the physical layer to the software layer to the actual user experience. Because the stack is controlled from top to bottom, it can be customized to your specific needs. These infrastructures will support Meta in developing and deploying larger and more complex AI models.

Over the next few years, we will see increased specialization and customization in chip design, specialized and workload-specific AI infrastructure, new systems and tools, and increased efficiency in product and design support. These will deliver increasingly sophisticated models and products built on the latest research, enabling people around the world to use this emerging technology.

Source: Meta official website

The above is the detailed content of Meta continuously throws the AI ​​to accelerate the ultimate move! The first AI inference chip, AI supercomputer specially designed for large model training. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:sohu.com. If there is any infringement, please contact admin@php.cn delete