The real light of domestic graphics cards! An in-depth interpretation of Moore Threads' domestic GPU, AI and Metaverse developments
1. A brief history of Moore Threads: lightspeed entrepreneurship, attacking on all fronts
Nowadays, we already have relatively mature home-grown CPUs, NAND flash memory, DRAM, and operating systems. GPUs, however, a critical part of any computing platform, have long been a serious gap. The main reason is that the hardware is extremely difficult to design and the ecosystem is even harder to cultivate; neither can be achieved overnight.
There are actually quite a few domestic GPU companies, but many of them are limited to specific industries or focus on high-performance computing. Among those that truly dare to pursue a comprehensive layout and enter the consumer market, Moore Threads cannot go unmentioned.
On May 31, Moore Threads invited Kuai Technology to its 2023 summer conference. We originally thought the focus would be a new generation of gaming and server graphics cards. It turned out that we were thinking too small: Moore Threads' layout goes far beyond that.
From entertainment and content creation to AI and cloud computing, from domestic digital office to the metaverse, Moore Threads announced, in one breath, breakthrough progress on many fronts: a new gaming graphics card and complete machine, DX11 drivers, a physics engine, cloud desktop solutions and machines, development tools, code porting tools, AI content creation, the metaverse, and digital humans. Together they open a new page for domestic GPUs and their ecosystem.
I believe everyone is familiar with the name Moore Threads. Here is a brief introduction to its history.
Moore Threads was established in October 2020. It is only a little over two and a half years old, yet it has achieved remarkable results.
The founder of Moore Threads is Mr. Zhang Jianzhong, former NVIDIA global vice president and general manager for China. He joined NVIDIA in 2005 and led the team that built a complete ecosystem for NVIDIA GPUs in China, helping China become NVIDIA's most important market in the world, bar none.
The core founding team of Moore Threads comes largely from NVIDIA. It forms a complete high-end chip talent team, covering GPU chip IP research and development, system software and hardware design, ecosystem building, and marketing.
In 2022, Moore Threads launched its unified GPU system architecture, MUSA, and released and mass-produced two full-featured GPU chips, "Sudi" and "Chunxiao". These are also the only full-featured high-end GPU chips in China built on a modern GPU architecture with four engines: graphics rendering, video encoding and decoding, AI computing acceleration, and physical simulation and scientific computing.
In terms of products, Moore Threads also quickly completed a full product line, at a speed and breadth that almost look like cheating.
Hardware includes the MTT S10/S30/S50 desktop graphics cards for digital office, the MTT S80, the first domestic gaming graphics card, and the full-featured MTT S2000/S3000 for data centers.
Software includes the first metaverse computing platform MTVERSE, the self-developed GPU physics engine AlphaCore, the digital human solution DIGITALME, the AIGC content generation platform Mobi Ma Liang, and more.
MTT S30 complete machine
MTT S3000 server eight-card parallel
On the ecosystem side, Moore Threads has reached strategic cooperation with more than 200 partners and has achieved high-quality delivery, especially of complete machines and boards for domestic digital office.
The PES Perfect Experience System Alliance is also growing, with partners covering mainstream domestic and foreign CPU manufacturers, operating system vendors, OEMs, software service providers, cloud service providers, and system software developers.
2. MTT S70, DX11 and the complete machine: strong in both software and hardware, truly playable
At this conference, what everyone cares about most is probably the newly released second gaming graphics card, the MTT S70. Before introducing it, let us briefly review the MTT S80, the first card that bravely entered the gaming market.
The MTT S80 is based on the GPU chip code-named "Chunxiao". It integrates 20 billion transistors and is equipped with 4096 MUSA architecture cores and 128 Tensor cores, with a built-in MUSA intelligent multimedia engine 2.0 (H.264/H.265/AV1 codecs), MUSA Security Engine 1.0, and MUSA Multi-bit Virtualization Engine (SR-IOV).
The core frequency is 1.8GHz, FP32 floating-point performance reaches 14.4TFLOPS (14.4 trillion floating-point operations per second), and INT8 integer performance reaches 57.6TOPS (57.6 trillion operations per second).
The card carries 16GB of GDDR6 video memory on a 256-bit bus, with an equivalent frequency of 14GHz and a bandwidth of up to 448GB/s.
It is the first to use a PCIe 5.0 x16 system interface, and its display outputs support DP 1.2a, HDMI 2.1, and 8K30. It is also the first domestic graphics card to support Windows and the DirectX graphics API.
The MTT S70 can be regarded as the younger brother of the MTT S80. The overall look is basically the same, still a refined, tough-looking triple-fan cooler, while the specifications have been trimmed down.
The core count is reduced to 3584, the core frequency to 1.6GHz, and FP32 floating-point performance to 11.2TFLOPS. The system interface changes to PCIe 4.0 x16, and the card supports four 8K30 ultra-high-definition display outputs.
What is more unusual is the video memory. The capacity is an unprecedented 7GB on a correspondingly narrower 224-bit bus; the equivalent frequency is still 14GHz, for a bandwidth of 392GB/s.
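The two bandwidth figures quoted for the S80 and S70 follow from bus width and data rate, and a short Python sketch makes the arithmetic explicit. It assumes the "equivalent frequency" of 14GHz means 14 Gbit/s per data pin. The function names are illustrative, and the 32-bit-per-chip interface width is a general GDDR6 assumption rather than something stated in the article.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: data pins x per-pin rate (Gbit/s) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


def chips_on_bus(bus_width_bits: int, chip_interface_bits: int = 32) -> int:
    """Number of memory chips, assuming each chip has a 32-bit interface (assumption)."""
    return bus_width_bits // chip_interface_bits


if __name__ == "__main__":
    for name, width in (("MTT S80", 256), ("MTT S70", 224)):
        bandwidth = memory_bandwidth_gb_s(width, 14)
        print(f"{name}: {width}-bit bus, {chips_on_bus(width)} chips, {bandwidth:.0f} GB/s")
```

Under these assumptions a 224-bit bus corresponds to seven memory chips, which would explain the unusual 7GB capacity if each chip holds 1GB.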
Dong Longfei, vice president of Moore Threads and general manager of its product division, said that one of Moore Threads' principles in making high-end cards is to use real, generous materials. The S80's 16GB of video memory is well suited to high-resolution gaming, AI, and similar scenarios, but it is also costly, so the company made the more cost-effective S70 7GB.
The MTT S70 7GB is already on the market, priced at 2,499 yuan.
Compared with the hardware iteration, the progress in software such as drivers and game support is even more gratifying.
In the nearly half a year since the MTT S80 was released, the Windows driver has been updated five times, and more than 60 games are now fully supported (more games are playable but not yet fully adapted). More than 20 of the 50 most popular games in internet cafes have been adapted.
Game performance has improved by about 50% on average since release, most notably in mainstream online games such as "League of Legends", "CrossFire", and "DOTA2"; "NBA2K Online2" now runs at about 2.5 times its early performance.
At the same time, more than 50 motherboards and more than 30 monitors have received good support.
On this basis, Moore Threads' support for DX11 makes us even more excited.
Unlike Intel graphics cards, which adapted to DX12 first and then worked down to DX11 and DX9, Moore Threads, which serves the Chinese market and Chinese players, chose to start with DX9, which has the broadest user base, to meet the most urgent needs of more players.
Moore Threads graphics cards cannot yet guarantee that every DX9 game runs well, for many reasons: non-standard game development practices, games optimized for other vendors' graphics cards, foreign games poorly matched to domestic hardware, and so on. Support for DX9 itself, however, is 100% complete.
Moore Threads has now begun a full sprint toward DX11, becoming the first Chinese GPU company to truly support DX11 games.
According to the plan, Moore Threads will release a DX11 community-edition driver in late June, with the first batch supporting AAA titles such as "Genshin Impact" and "DOTA2".
At the same time, Moore Threads launched "Alpha Action" in its "Mocha Players" community, inviting more players to use the DX11 community-edition driver and give feedback, so as to accelerate driver iteration.
According to Ma Jian, product manager for the MTT S80 and S70 graphics cards, the DX11 driver is currently about 80% complete, and Moore Threads will speed up work toward making the official version available for download.
Moore Threads is also working on ray tracing and DLSS-like super-resolution technology.
Because MTT S series gaming graphics cards are still new and software and hardware compatibility is in its infancy, Moore Threads also released a complete machine this time, the "Intelligent Entertainment Rubik's Cube", to help users run the cards more stably and get the most out of them.
Behind its good-looking exterior, the Intelligent Entertainment Rubik's Cube is available with two graphics card options, the MTT S80 and S70, and comes with a curated game center pre-installed, so it is ready to play at first boot.
It also has a built-in PES System Management Center for real-time monitoring of system status, and links to cloud applications so users can experience the latest progress at any time.
Moore Threads will also open Ubuntu driver downloads to support users' learning and application development in AI computing.
At the conference, we also tried out an Intelligent Entertainment Rubik's Cube complete machine based on the MTT S70 graphics card.
The exquisite, compact design is eye-catching. The first batch of adapted DX11 games run quite smoothly, with picture quality and frame rates that fully meet the needs of mainstream players. Its performance in development and computing is also notable.
3. Mobi Ma Liang: AI first, with a better understanding of Chinese
AI is undoubtedly the hottest topic at the moment, and good AI applications need everything from hardware computing power to creation platforms.
Moore Threads has treated AIGC as a core development direction from the beginning, and has now officially launched "Mobi Ma Liang", an AIGC content creation platform that integrates software and hardware and offers users a zero-cost AI creation platform.
Mobi Ma Liang uses the industry's cutting-edge multi-modal pre-trained large models and generative diffusion models.
First, the multi-modal pre-trained model is trained with contrastive learning on massive image-text data, learning the association between images and text and producing image and text encoders. During generation, the encoder first encodes the input text into a latent representation and combines it with other conditional inputs such as pictures, semantics, and image masks; the generative model and decoder then produce the final image.
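To make the contrastive pre-training step more concrete, here is a minimal pure-Python sketch of a symmetric image-text contrastive loss of the kind popularized by CLIP. The article does not say which loss Mobi Ma Liang uses, so the formulation, the function names, and the temperature of 0.07 are all assumptions chosen for illustration. Matching image-text pairs sit on the diagonal of the similarity matrix, and the loss rewards the model when each image is most similar to its own caption and vice versa.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy(logits, target):
    """Negative log-softmax of the target entry, computed stably."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target]


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; pair i consists of image i and text i."""
    images = [normalize(v) for v in image_embs]
    texts = [normalize(v) for v in text_embs]
    n = len(images)
    # sim[i][j]: temperature-scaled cosine similarity of image i and text j
    sim = [[sum(a * b for a, b in zip(images[i], texts[j])) / temperature
            for j in range(n)] for i in range(n)]
    image_to_text = sum(cross_entropy(sim[i], i) for i in range(n)) / n
    text_to_image = sum(cross_entropy([sim[i][j] for i in range(n)], j)
                        for j in range(n)) / n
    return (image_to_text + text_to_image) / 2


if __name__ == "__main__":
    image_embs = [[1.0, 0.0], [0.0, 1.0]]
    aligned = contrastive_loss(image_embs, [[0.9, 0.1], [0.1, 0.9]])
    shuffled = contrastive_loss(image_embs, [[0.1, 0.9], [0.9, 0.1]])
    print(aligned < shuffled)  # correctly paired captions give the lower loss
```

In a real system the embeddings would come from trainable image and text encoders and the loss would be minimized by gradient descent. The toy vectors here only show that correct pairings score better than mismatched ones.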
It supports bilingual text descriptions in Chinese and English, and can generate multiple pictures at one time;
It supports image-guided generation, making the result more accurate through edge detection, skeleton detection, depth detection, etc.;
It supports multiple models, multiple styles, and multiple artists, including general, portrait, 2.5D, two-dimensional (anime), etc.;
It supports sensitive content filtering, providing a safer creative environment;
It supports many personalized functions, such as sketch-based generation, similar-image generation, partial replacement, edge expansion, and high-definition super-resolution, and works can be published and shared in the gallery.
It is particularly worth mentioning that Mobi Ma Liang has a more accurate and deeper understanding of the Chinese language and Chinese culture, which makes it easier to create works in a Chinese style.
The Mobi Ma Liang AIGC platform is now in internal testing. It offers multiple access methods: users can log in through the web and mini programs, or call it remotely through its APIs.
For users who need a complete solution, Moore Threads also provides private deployment, including GPU clusters, a heterogeneous computing power scheduling platform, API interfaces, and sample applications similar to Mobi Ma Liang.
Moore Threads' metaverse strategy is not just about AIGC; it aims to provide metacomputing power around people, scenes, and content. Although the metaverse concept is not as hot as it once was, Moore Threads holds that the metaverse has not faded away but needs better computing platforms and more reasonable application scenarios to move it forward. It is a long-term project.
To this end, Moore Threads has upgraded the MTVERSE metaverse platform, which now supports real-time cloud rendering.
MTVERSE is a Metaverse platform that provides scalable performance, real-time rendering and simulation, and AI-driven diversified computing power support.
21Vianet, a leading third-party IDC service provider, has taken the lead in deploying a Moore Threads thousand-card GPU computing cluster in the cloud, combining the MTVERSE platform with Unreal Engine and cloud rendering streaming technology to accelerate computing for 51WORLD's 51Meet metaverse high-precision open platform.
This is the first Metaverse application to close the loop entirely on domestic technology. Even with many concurrent users, it delivers a low-latency, high-fidelity, immersive Metaverse experience.
In addition, metaverse applications such as Migu Metaverse, Zhihui Yunzhou Video Twins, and Panorama 3D Reconstruction are also being updated one after another.
Moore Threads has also upgraded the DIGITALME digital human solution, which can be used in live streaming, social networking, film and television animation, office, entertainment, and other scenarios.
The DIGITALME solution includes the "Nuwa" digital human generator, the "Painted Skin" expression drive engine, the "Souying" action drive engine, and the "Answer" dialogue system.
Among them, "Answer" has upgraded two main capabilities. One is natural voice interaction with people, so that it "can listen and speak"; the other is intelligent question answering based on large language models, so that it "can think and has something to say".
During the conference, Moore Threads demonstrated two digital human product solutions: a 2D broadcast digital human and a 3D interactive digital human.
Interestingly, the online part of the conference was hosted by a digital version of Zhang Jianzhong. Many people said they did not notice anything until they had finished watching the conference.
4. Cloud desktop and digital office: new upgrades to cut costs and raise efficiency
At the beginning of this year, Moore Threads launched cloud desktop products and solutions based on the multi-function server GPU MTT S2000, including the vPC cloud desktop GPU virtualization product MT vGPU 1.0, MT GPU pass-through, and MT GPU-accelerated protocol encoding.
Moore Threads also jointly released the "New GPU Cloud Desktop Development White Paper" under the leadership of the China Academy of Information and Communications Technology, China Mobile Cloud, and the China Telecom Research Institute, clearly defining experience standards for cloud desktop scenarios.
Based on this standard, across the four main scenarios of video playback, web browsing, Office, and education software, a single server based on the Moore Threads MTT S2000 can support more than 40 concurrent high-definition users.
Compared with a traditional CPU-based cloud desktop solution, performance improves by nearly 5 times and total cost of ownership (TCO) falls by more than 60%.
At this conference, the cloud desktop product MT vGPU was upgraded to version 2.1, with four main changes:
First, it adds support for the MTT S3000 graphics card, with up to 28 concurrent virtual machines on a single card and a performance improvement of up to 40%;
Second, it adds support for GPU super-resolution technology and SR-IOV virtualization. The former can double the number of virtual machines, and the latter provides better QoS, isolation, and security;
Third, overall picture quality is upgraded from 1080p to 4K;
Fourth, driver updates add support for Windows Server, full support for the H.264, H.265, and AV1 video codecs, and support for more browsers and video players.
At the same time, Moore Threads launched the MCCX VDI cloud desktop all-in-one machine, a complete end-to-end delivery solution that includes servers, thin clients, and software.
It is aimed mainly at education and office use, and comes in an Education Reform Special Edition and an Office Experience Enhanced Edition.
The Education Reform Special Edition can effectively accelerate 19 education-reform software packages, such as Tello Edu and Code Craft.
The Office Experience Enhanced Edition is customized and optimized for more than 60 office applications, such as the Office and WPS office suites, the Adobe Reader PDF reader, video conferencing software, and the WinRAR archiver.
Moore Threads' cloud desktop solution has been adapted to the products of more than 10 customers, including Tianyi Cloud Computer and Mobile Cloud Computer. Product introduction has also been completed with Sangfor, H3C Information, Huayun, Kuzhan Technology, and Tingyu Technology, and the solution will be rolled out across all walks of life.
At the same time, Moore Threads' domestic digital office solutions have also been fully upgraded.
Moore Threads has taken the lead in supporting the complete functionality of OpenGL 4.0 and Vulkan 1.3, passing 100% of the interface compatibility tests. It also supports tessellation and other graphics features, providing more refined geometric detail.
As the domestic ecosystem flourishes, there are many options for both CPUs and operating systems. They combine in hundreds of ways, which makes adaptation very difficult and complex.
To this end, Moore Threads supports DKMS (Dynamic Kernel Module Support), which rebuilds the driver's kernel module automatically for each installed kernel. This makes it quick and easy to adapt to the various CPU and OS version combinations, and development efficiency can be improved dozens of times.
Moore Threads' office solutions have been adapted to domestic operating systems such as Kirin, openKylin, Tongxin, Deepin, Ningsi, Zhongke Founder, and Puhua, and were the first to complete comprehensive compatibility certification with Tongxin UOS and Kirin OS. Moore Threads is also the first domestic GPU company to pass Tongxin UHQL quality certification.
In addition, Moore Threads GPUs accelerate nearly a hundred domestic applications, covering office, video conferencing, audio and video, browsers, video editing, design, GIS, and more.
5. Development tools: zero-cost porting to handle CUDA
Developing GPUs and graphics cards is very difficult, and software development and ecosystem promotion are even more so. The global GPU industry is almost monopolized by NVIDIA and its CUDA; even AMD and Intel find it hard to shake that position, and domestic manufacturers have had almost no presence in this area.
In 2022, Moore Threads launched the metacomputing unified system architecture "MUSA", which is quite similar to CUDA. It includes a unified programming model, software runtime libraries, a driver framework, an instruction set architecture, and a chip architecture, providing a complete solution from the underlying hardware up to software development.
Around the MUSA architecture, Moore Threads announced a series of important technical updates this time.
The first is the software toolkit MUSA Toolkit 1.0.
It is extremely rich, including the MUSA driver (general computing, graphics rendering, multimedia, multi-card interconnect), runtime library, C standard library, compiler, AI acceleration library, template library, algorithm library, general computing library, math library, communication library, multimedia library, and more.
This toolkit gives developers a full one-stop set of services. It can call on the hardware capabilities of Moore Threads GPUs from different angles as needed, fully releasing their computing and graphics power.
The second is the code migration tool MUSIFY.
It can quickly migrate existing CUDA programs to the MUSA platform, automatically porting CUDA code at zero cost.
After the automatic port, developers can complete hotspot analysis and targeted optimization in a short time, greatly shortening the migration and optimization cycle and saving time and effort.
In the past, this kind of port cost hundreds of man-days of development; with MUSIFY, it takes only a few to a dozen or so man-days.
With GPU ecosystem development almost entirely built around CUDA and optimized for it, quick and easy porting that preserves performance is undoubtedly the most reasonable way to break the deadlock.
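The article does not describe how MUSIFY works internally. One common approach to this kind of automatic port is source-to-source rewriting of API identifiers, and the toy Python sketch below illustrates that general idea only. The mapping table, including the `musa`-prefixed names and the header name, is a hypothetical placeholder for illustration and is not taken from MUSIFY or the MUSA Toolkit.

```python
import re

# Hypothetical CUDA-to-target name mapping, for illustration only.
API_MAP = {
    "cuda_runtime.h": "musa_runtime.h",
    "cudaMalloc": "musaMalloc",
    "cudaMemcpy": "musaMemcpy",
    "cudaFree": "musaFree",
}


def port_source(source: str, api_map: dict) -> str:
    """Rewrite whole identifiers found in api_map; leave everything else untouched."""
    # Try longer names first so a short name never shadows a longer one.
    names = sorted(api_map, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # Lookarounds ensure we only match complete identifiers, not prefixes.
    pattern = re.compile(r"(?<![A-Za-z0-9_])(" + alternatives + r")(?![A-Za-z0-9_])")
    return pattern.sub(lambda match: api_map[match.group(1)], source)


if __name__ == "__main__":
    cuda_code = (
        "#include <cuda_runtime.h>\n"
        "cudaMalloc(&buf, size);\n"
        "cudaMemcpy(buf, host, size, cudaMemcpyHostToDevice);\n"
        "cudaFree(buf);\n"
    )
    print(port_source(cuda_code, API_MAP))
```

Identifiers that are not in the table, such as `cudaMemcpyHostToDevice` here, pass through unchanged. In a real port those leftovers are exactly what a developer would review by hand, which is consistent with the few man-days of follow-up work the article mentions.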
The third is the open source MT PyTorch AI framework.
Built on Moore Threads' MUSA, it lets developers reuse a large number of model operators from the PyTorch open source community and reduce development costs. It supports inference for a wide range of models across CV, NLP, TTS speech, AIGC, digital humans, and other fields, and can run distributed multi-card inference for typical large models such as ChatGLM, Stable Diffusion, and LLaMA.
Using distributed training techniques such as data parallelism, model parallelism, and ZeRO, MT PyTorch can train simple base models and NLP language models with a typical Transformer structure.
The fourth is a new version of the real-time fluid simulation tool Catalyst FX.
It is based on Moore Threads' self-developed multi-platform physics engine AlphaCore and can produce fluid effects directly in Houdini without changing the existing workflow. Compared with the native PyroFX, performance improves by 5 to 10 times.
AlphaCore has a deeply optimized DX11 Compute Shader version. In fluid dynamics simulation, the MTT S80 running Catalyst FX reaches more than twice the performance of mainstream graphics cards on the market.
In addition, compared with the traditional Houdini Vellum production process, VeraFiber, a soft-body simulation tool accelerated by Moore Threads GPUs, can raise solving efficiency to 3-5 times.
Currently, Catalyst FX and VeraFiber have completed the development of the Houdini plug-in interface, and the beta test version of the Houdini plug-in will be available for download on June 6.
In terms of application cooperation, the Houdini version of the Catalyst FX plug-in has been delivered to and integrated with MOREVFX, a well-known domestic film post-production visual effects company, and VeraFiber has been used successfully by DOVFX Shuhai Culture, a CG animation production team invested in by NetEase Games, for cloth and hair simulation of complex characters in game CG films.
Buyu Animation, Sunac Animation, Chasing Light Animation, Pingta Studio, and others are also ecosystem partners of Moore Threads AlphaCore.
In order to gather the power of developers and expand the ecosystem, Moore Threads also launched the MUSA Community Developer Program.
Moore Threads provides partners and developers with complete resources, including MUSA development tools, programming guides, tutorial series, open source frameworks, and model libraries.
Moore Threads will work with third-party communities to promote the development of new algorithm models, computing systems and platforms.
6. Conclusion: the domestic light has a promising future
When Moore Threads was founded, many people were not optimistic about it. After all, in the current environment, with international giants like NVIDIA nearly monopolizing the market, developing a domestic GPU from scratch, and then achieving good driver, software, and hardware compatibility, building a complete ecosystem, fully releasing computing and graphics performance, and commercializing it widely, seemed almost unimaginable.
It can be said that, to a certain extent, making a GPU is much more difficult than making a CPU.
However, in just over two years, Moore Threads has delivered a performance that deserves to be known by more people:
It has built a unified system architecture, a modern GPU architecture fully covering graphics rendering, accelerated computing, display and codecs, AI, and more;
Its hardware products cover desktops, workstations, servers, the cloud, and other scenarios, and have been commercialized quickly. In particular, it has had the courage to bring gaming graphics cards to ordinary users and accept real-world tests and feedback;
Its driver development iterates rapidly, steadily expanding game and hardware compatibility, continuing to improve performance, and unlocking potential;
Its development tools and software products keep growing, giving developers and users a complete set of solutions;
Its ecosystem continues to expand, with partners across all walks of life...
As a new force in the GPU industry, Moore Threads has risen rapidly and built a rich product line that reaches into graphics, computing, AI, and many other areas and scenarios. It has also spared no effort on development tools and the ecosystem. The scale of its layout is amazing, and the speed at which it is breaking the deadlock is astonishing.
In fact, in the current environment, Moore Threads has undoubtedly chosen the hardest path: building a complete set of solutions and an ecosystem almost from scratch. It is destined to be extremely difficult, but once a real breakthrough is achieved, the company will truly be able to control its own future.
Is the step too big? On this question, Moore Threads has a clear, long-term view, and has had a clear positioning and direction since its founding.
Dong Longfei, vice president of Moore Threads and general manager of its product division, said bluntly that, as a chip company, Moore Threads has to do more than launch a few cards. It aims to follow the broader trend of integrating graphics, computing, and AI, building up from the chip's underlying architecture and using software-level acceleration to fully unleash the functions and potential of that architecture, thereby forming a large industry.
As a rising star, Moore Threads still has many shortcomings and many areas to catch up on. However, after more than two years of real-world performance, I believe everyone now understands it much better, has considerable confidence in it, and expects more from its future.