Home  >  Article  >  Technology peripherals  >  Unified AI development: Google OpenXLA is open source and integrates all frameworks and AI chips

Unified AI development: Google OpenXLA is open source and integrates all frameworks and AI chips

WBOY
WBOYforward
2023-04-29 11:55:061296browse

At the Google Cloud Next 2022 event in October last year, the OpenXLA project officially surfaced. Google cooperated with the open source AI framework promoted by technology companies including Alibaba, AMD, Arm, Amazon, Intel, Nvidia and other technology companies. Committed to bringing together different machine learning frameworks to enable machine learning developers to proactively choose frameworks and hardware.

On Wednesday, Google announced that the OpenXLA project is officially open source.

Unified AI development: Google OpenXLA is open source and integrates all frameworks and AI chips

Project link: https://github.com/openxla/xla

By creating a unified machine learning compiler that works with multiple different machine learning frameworks and hardware platforms, OpenXLA can accelerate the delivery of machine learning applications and provide greater code portability. This is a significant project for AI research and applications, and Jeff Dean also promoted it on social networks.

Unified AI development: Google OpenXLA is open source and integrates all frameworks and AI chips

Today, machine learning development and deployment are impacted by fragmented infrastructure that can be compromised by frameworks, Varies by hardware and use case. This isolation limits the speed at which developers can work and creates barriers to model portability, efficiency, and production.

On March 8, Google and others took a major step toward removing these barriers with the opening of the OpenXLA project, which includes the XLA, StableHLO, and IREE repositories.

OpenXLA is an open source ML compiler ecosystem co-developed by AI/machine learning industry leaders, with contributors including Alibaba, AWS, AMD, Apple, Arm, Cerebras, Google, Graphcore, Hugging Face, Intel, Meta and Nvidia. It enables developers to compile and optimize models from all leading machine learning frameworks for efficient training and serving on a variety of hardware. Developers using OpenXLA can observe significant improvements in training time, throughput, service latency, and ultimately release and compute costs.

Challenges Facing Machine Learning Technology Facilities

As AI technology enters the practical stage, development teams in many industries are using machine learning to address real-world challenges. Examples include disease prediction and prevention, personalized learning experiences, and exploration of black hole physics.

With the number of model parameters growing exponentially and the amount of computation required by deep learning models doubling every six months, developers are seeking maximum performance and utilization of their infrastructure . A large number of teams are leveraging a variety of hardware models, from energy-efficient machine learning-specific ASICs in the data center to AI edge processors that provide faster response times. Accordingly, in order to improve efficiency, these hardware devices use customized and unique algorithms and software libraries.

But on the other hand, if there is no universal compiler to bridge different hardware devices to the multiple frameworks in use today (such as TensorFlow, PyTorch), people will need to put in a lot of effort to Run machine learning efficiently. In practice, developers must manually optimize model operations for each hardware target. This means using custom software libraries or writing device-specific code requires domain expertise.

This is a paradox, using proprietary technology for efficiency only results in siled, non-generalizable paths across frameworks and hardware resulting in high maintenance costs and in turn vendor lock-in , slowing down the progress of machine learning development.

Solution and Goals

The OpenXLA project provides a state-of-the-art ML compiler that scales across the complexity of ML infrastructure. Its core pillars are performance, scalability, portability, flexibility and ease of use. With OpenXLA, we aspire to realize the greater potential of AI in the real world by accelerating the development and delivery of AI.

OpenXLA aims to:

  • Allows developers to easily compile and optimize any model in their preferred framework for a variety of hardware with a unified compiler API that works with any framework and plugs into dedicated device backends and optimizations .
  • Provides industry-leading performance for current and emerging models, and can also be scaled to multiple hosts and accelerators to meet the constraints of edge deployment and promoted to new model architectures in the future.
  • Building a layered and scalable machine learning compiler platform that provides developers with MLIR-based components that can be reconfigured for their unique use cases, for use with hardware Customized compilation process.

AI/ML Leaders Community

The challenges we face today in machine learning infrastructure are enormous and no one organization can effectively do it alone address these challenges. The OpenXLA community brings together developers and industry leaders operating at different levels of the AI ​​stack—from frameworks to compilers, runtimes, and chips—and is therefore ideally suited to address the fragmentation we see in the ML space.

As an open source project, OpenXLA adheres to the following principles:

  • Equal status: Individuals are equal regardless of affiliation Make a contribution. Technical leaders are those who contribute the most time and energy.
  • Culture of Respect: All members are expected to uphold the project values ​​and code of conduct, regardless of their position in the community.
  • Scalable, efficient governance: Small teams make consensus-based decisions, with clear but rarely used upgrade paths.
  • Transparency: All decisions and rationales should be clearly visible to the public.

OpenXLA Ecosystem: Performance, Scale, and Portability

OpenXLA removes barriers for machine learning developers with a modular tool chain that makes it universal The compiler interface is supported by all leading frameworks, leverages portable standardized model representations, and provides domain-specific compilers with powerful target-specific and hardware-specific optimizations. The toolchain includes XLA, StableHLO, and IREE, all of which leverage MLIR: a compiler infrastructure that enables machine learning models to be represented, optimized, and executed consistently on hardware.

Unified AI development: Google OpenXLA is open source and integrates all frameworks and AI chips

OpenXLA Key Highlights

Scope of Machine Learning Use Cases

Current usage of OpenXLA spans the range of ML use cases, including full training of models such as DeepMind’s AlphaFold, GPT2 and Swin Transformer on Alibaba Cloud, as well as multi-modal training on Amazon.com LLM training. Customers such as Waymo leverage OpenXLA for in-vehicle real-time inference. Additionally, OpenXLA is used to optimize Stable Diffusion services on local machines equipped with AMD RDNA™ 3.

Best Performance, Out of the Box

OpenXLA eliminates the need for developers to write device-specific code, You can easily speed up model performance. It features overall model optimization capabilities, including simplifying algebraic expressions, optimizing in-memory data layout, and improving scheduling to reduce peak memory usage and communication overhead. Advanced operator fusion and kernel generation help improve device utilization and reduce memory bandwidth requirements.

Easily scale workloads

Developing efficient parallelization algorithms is time-consuming and requires expertise. With features like GSPMD, developers only need to annotate a subset of key tensors, which can then be used by the compiler to automatically generate parallel computations. This eliminates the significant effort required to partition and efficiently parallelize models across multiple hardware hosts and accelerators.

Portability and Option

OpenXLA provides out-of-the-box support for a variety of hardware devices, Including AMD and NVIDIA GPUs, x86 CPUs and Arm architectures, and ML accelerators such as Google TPU, AWS Trainium and Inferentia, Graphcore IPU, Cerebras Wafer-Scale Engine, and more. OpenXLA also supports TensorFlow, PyTorch, and JAX through StableHLO, a portable layer used as an input format for OpenXLA.

flexibility

OpenXLA provides users with the flexibility to manually adjust model hotspots. Extension mechanisms such as custom calls enable users to write deep learning primitives in CUDA, HIP, SYCL, Triton, and other kernel languages ​​to take full advantage of hardware features.

StableHLO

StableHLO is a portability layer between ML frameworks and ML compilers and is a support A set of high-level operations (HLO) operations for dynamics, quantization, and sparsity. Additionally, it can be serialized to MLIR bytecode to provide compatibility guarantees. All major ML frameworks (JAX, PyTorch, TensorFlow) can produce StableHLO. In 2023, Google plans to work closely with the PyTorch team to achieve integration with PyTorch version 2.0.

The above is the detailed content of Unified AI development: Google OpenXLA is open source and integrates all frameworks and AI chips. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:51cto.com. If there is any infringement, please contact admin@php.cn delete