Startup takes on Nvidia: CUDA emulated on AMD cards, with original programs compiled and run directly
AMD graphics cards can now compile and run original CUDA programs without any modification or conversion!
A British start-up has released a CUDA compilation toolkit for AMD GPUs that is free for commercial use.
The release immediately sparked heated discussion and reached the top of Hacker News.
The tool is called SCALE, and the developer positions it as a GPGPU (General Purpose GPU) programming toolkit.
So far, nine programs, including the large-model framework llama-cpp, have passed testing and run correctly.
Unlike other approaches, SCALE presents itself as an installation of the CUDA toolkit and compiles CUDA source directly, without first translating it into another language.
As a result, SCALE can also support NVIDIA-specific features such as inline PTX assembly.
According to the official website, SCALE has three main components: an nvcc-compatible compiler, AMD implementations of the CUDA runtime and driver APIs, and libraries built on ROCm.
The compiler accepts the nvcc dialect of CUDA, including inline PTX, and compiles it directly into binaries that run on AMD GPUs.
The ROCm-based libraries provide the "CUDA-X" APIs, which SCALE relies on when a program uses libraries such as cuBLAS and cuSOLVER.
The key innovation of SCALE is that it accepts CUDA programs as-is, with no porting to another language. It is compatible with both the nvcc and clang ways of compiling CUDA, and existing build tools and scripts (such as cmake) keep working.
The developers say SCALE is fully compatible with CUDA, so developers no longer need to write separate code for different GPU platforms.
This is very different from AMD's own HIP. HIP works by rewriting the CUDA code, may misinterpret the code when it meets complex macros, and does not support NVIDIA-specific features such as inline PTX.
The SCALE authors go so far as to say that HIP cannot solve the CUDA compatibility problem.
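Why rewriting source text can stumble on macros is easy to see in a toy sketch. The code below is not AMD's HIP tooling and makes no claim about how it works internally. It is a deliberately naive textual renamer, and `naive_translate`, the two-entry rename table and the sample source are all invented for illustration. It shows that an API name assembled by the preprocessor never appears literally in the text, so a purely textual rewrite cannot see it.

```python
import re

# Hypothetical two-entry rename table, invented for this sketch.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaFree": "hipFree",
}

def naive_translate(source: str) -> str:
    """Rename CUDA identifiers that appear literally, as whole words, in the text."""
    names = "|".join(re.escape(name) for name in CUDA_TO_HIP)
    pattern = re.compile(r"\b(" + names + r")\b")
    return pattern.sub(lambda match: CUDA_TO_HIP[match.group(1)], source)

# Invented sample source. The last line builds "cudaFree" by token pasting,
# so that name only exists after the C preprocessor has run.
SAMPLE = """\
#define CUDA_API(name) cuda##name
cudaMalloc(&ptr, size);
CUDA_API(Free)(ptr);
"""

translated = naive_translate(SAMPLE)
print(translated)
```

In the printed result the literal `cudaMalloc` call has been renamed, while the macro-built call is untouched and would still expand to `cudaFree` at compile time. A compiler that accepts CUDA directly sidesteps this class of problem because it sees the code only after macro expansion.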
In addition, SCALE's language is a superset of CUDA. It offers optional language extensions that make writing GPU code easier and more efficient for developers who want to move away from nvcc.
The authors hope that developers will eventually be able to write code once and run it on different hardware platforms, and they are working to close the compatibility gap between the popular CUDA programming language and other hardware vendors.
SCALE currently covers the following AMD GPU targets:
Supported: gfx1030 (RX 6000 series) and gfx1100 (RX 7000 series)
"Seems to work": gfx1010 (RX 5000 series) and gfx1101
In progress: gfx900 (RX Vega series)
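The support list above can be captured as a small lookup table, which is handy when checking a machine before trying SCALE. This is only a sketch: the statuses are copied from the list as reported in this article and may be out of date, and the names `Support`, `SCALE_TARGETS` and `scale_support` are hypothetical, not part of SCALE.

```python
from enum import Enum

class Support(Enum):
    SUPPORTED = "supported"
    SEEMS_TO_WORK = "seems to work"
    IN_PROGRESS = "in progress"
    UNLISTED = "not listed"

# Status per GPU architecture, as reported in the article (may have changed since).
SCALE_TARGETS = {
    "gfx1030": Support.SUPPORTED,      # RX 6000 series
    "gfx1100": Support.SUPPORTED,      # RX 7000 series
    "gfx1010": Support.SEEMS_TO_WORK,  # RX 5000 series
    "gfx1101": Support.SEEMS_TO_WORK,
    "gfx900": Support.IN_PROGRESS,     # RX Vega series
}

def scale_support(arch: str) -> Support:
    """Return the reported status for an architecture name such as 'gfx1100'."""
    return SCALE_TARGETS.get(arch.strip().lower(), Support.UNLISTED)

# gfx906 is an arbitrary example of an architecture the article does not mention.
for arch in ("gfx1100", "gfx900", "gfx906"):
    print(arch, "->", scale_support(arch).value)
```

Anything absent from the table is reported as "not listed" rather than "unsupported", since the article does not say what happens on other architectures.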
The authors also tested a number of open-source CUDA projects and successfully ran nine CUDA applications with SCALE.
Since SCALE is a brand-new project, the authors have prepared a series of tutorials covering everything from installation to compilation, with several kinds of sample programs.
The key steps of each tutorial come with the relevant code, and the tutorials even explain how to identify the model of your own GPU.
For problems that arise during use, the authors describe common troubleshooting methods and have opened a Discord server where users can talk directly to the development team.
The startup behind SCALE is Spectral Compute, founded in the UK in 2018. It says it has a deep understanding of CPU and GPU architecture, and its goal is to help developers use computing resources efficiently.
Some commenters believe that if SCALE really works as advertised, it will challenge NVIDIA's moat and let AMD compete with it directly.
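However, it is still too early to draw conclusions. After all, the developers themselves acknowledge that SCALE still has some shortcomings compared with the original CUDA.
They have also made clear that some CUDA APIs and features are not supported, but have not provided a specific list.
As for other shortcomings of the "AMD solution", one commenter who claims to have spoken with the SCALE team said that SCALE currently cannot use Tensor Cores, which means the FlashAttention acceleration framework cannot run on AMD.
Moreover, because NVIDIA cards have powerful matrix multiplication units, AMD cards may not perform as well as NVIDIA cards even when the code compiles and runs.
Some commenters think NVIDIA dominates because AMD has been unwilling to invest in improving the machine learning performance of its GPUs.
Even if programs run efficiently, it is also questionable whether AMD cards are really cheap and easy to obtain.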
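Another group of commenters believes the biggest issue is not technical feasibility but the legal questions that follow from it.
This issue also sparked broad discussion, with no conclusion so far.
Some think SCALE faces the same legal doubts as ZLUDA (another way of running CUDA programs on AMD) and could draw a lawsuit from NVIDIA.
In particular, NVIDIA's EULA only permits the CUDA SDK to be used for developing applications that run on NVIDIA cards, so a compatible implementation such as SCALE might be prohibited.
Other commenters immediately countered that SCALE does not use NVIDIA's "SDK" at all, so how can an SDK license agreement apply?
In short, whether over technical shortcomings or legal questions, the discussion of this new tool is still ongoing.
Whether it is useful is for developers to decide.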
Reference links:
[1]https://docs.scale-lang.com/
[2]https://news.ycombinator.com/item?id=40970560
This article comes from the WeChat public account Qubit (ID: QbitAI), author: Cressy