


Unified AI development: Google OpenXLA is open source and integrates all frameworks and AI chips
At the Google Cloud Next 2022 event in October last year, the OpenXLA project officially surfaced. Google cooperated with the open source AI framework promoted by technology companies including Alibaba, AMD, Arm, Amazon, Intel, Nvidia and other technology companies. Committed to bringing together different machine learning frameworks to enable machine learning developers to proactively choose frameworks and hardware.
On Wednesday, Google announced that the OpenXLA project is officially open source.
Project link: https://github.com/openxla/xla
By creating a unified machine learning compiler that works with multiple different machine learning frameworks and hardware platforms, OpenXLA can accelerate the delivery of machine learning applications and provide greater code portability. This is a significant project for AI research and applications, and Jeff Dean also promoted it on social networks.
Today, machine learning development and deployment are impacted by fragmented infrastructure that can be compromised by frameworks, Varies by hardware and use case. This isolation limits the speed at which developers can work and creates barriers to model portability, efficiency, and production.
On March 8, Google and others took a major step toward removing these barriers with the opening of the OpenXLA project, which includes the XLA, StableHLO, and IREE repositories.
OpenXLA is an open source ML compiler ecosystem co-developed by AI/machine learning industry leaders, with contributors including Alibaba, AWS, AMD, Apple, Arm, Cerebras, Google, Graphcore, Hugging Face, Intel, Meta and Nvidia. It enables developers to compile and optimize models from all leading machine learning frameworks for efficient training and serving on a variety of hardware. Developers using OpenXLA can observe significant improvements in training time, throughput, service latency, and ultimately release and compute costs.
Challenges Facing Machine Learning Technology Facilities
As AI technology enters the practical stage, development teams in many industries are using machine learning to address real-world challenges. Examples include disease prediction and prevention, personalized learning experiences, and exploration of black hole physics.
With the number of model parameters growing exponentially and the amount of computation required by deep learning models doubling every six months, developers are seeking maximum performance and utilization of their infrastructure . A large number of teams are leveraging a variety of hardware models, from energy-efficient machine learning-specific ASICs in the data center to AI edge processors that provide faster response times. Accordingly, in order to improve efficiency, these hardware devices use customized and unique algorithms and software libraries.
But on the other hand, if there is no universal compiler to bridge different hardware devices to the multiple frameworks in use today (such as TensorFlow, PyTorch), people will need to put in a lot of effort to Run machine learning efficiently. In practice, developers must manually optimize model operations for each hardware target. This means using custom software libraries or writing device-specific code requires domain expertise.
This is a paradox, using proprietary technology for efficiency only results in siled, non-generalizable paths across frameworks and hardware resulting in high maintenance costs and in turn vendor lock-in , slowing down the progress of machine learning development.
Solution and Goals
The OpenXLA project provides a state-of-the-art ML compiler that scales across the complexity of ML infrastructure. Its core pillars are performance, scalability, portability, flexibility and ease of use. With OpenXLA, we aspire to realize the greater potential of AI in the real world by accelerating the development and delivery of AI.
OpenXLA aims to:
- Allows developers to easily compile and optimize any model in their preferred framework for a variety of hardware with a unified compiler API that works with any framework and plugs into dedicated device backends and optimizations .
- Provides industry-leading performance for current and emerging models, and can also be scaled to multiple hosts and accelerators to meet the constraints of edge deployment and promoted to new model architectures in the future.
- Building a layered and scalable machine learning compiler platform that provides developers with MLIR-based components that can be reconfigured for their unique use cases, for use with hardware Customized compilation process.
AI/ML Leaders Community
The challenges we face today in machine learning infrastructure are enormous and no one organization can effectively do it alone address these challenges. The OpenXLA community brings together developers and industry leaders operating at different levels of the AI stack—from frameworks to compilers, runtimes, and chips—and is therefore ideally suited to address the fragmentation we see in the ML space.
As an open source project, OpenXLA adheres to the following principles:
- Equal status: Individuals are equal regardless of affiliation Make a contribution. Technical leaders are those who contribute the most time and energy.
- Culture of Respect: All members are expected to uphold the project values and code of conduct, regardless of their position in the community.
- Scalable, efficient governance: Small teams make consensus-based decisions, with clear but rarely used upgrade paths.
- Transparency: All decisions and rationales should be clearly visible to the public.
OpenXLA Ecosystem: Performance, Scale, and Portability
OpenXLA removes barriers for machine learning developers with a modular tool chain that makes it universal The compiler interface is supported by all leading frameworks, leverages portable standardized model representations, and provides domain-specific compilers with powerful target-specific and hardware-specific optimizations. The toolchain includes XLA, StableHLO, and IREE, all of which leverage MLIR: a compiler infrastructure that enables machine learning models to be represented, optimized, and executed consistently on hardware.
OpenXLA Key Highlights
Scope of Machine Learning Use Cases
Current usage of OpenXLA spans the range of ML use cases, including full training of models such as DeepMind’s AlphaFold, GPT2 and Swin Transformer on Alibaba Cloud, as well as multi-modal training on Amazon.com LLM training. Customers such as Waymo leverage OpenXLA for in-vehicle real-time inference. Additionally, OpenXLA is used to optimize Stable Diffusion services on local machines equipped with AMD RDNA™ 3.
Best Performance, Out of the Box
OpenXLA eliminates the need for developers to write device-specific code, You can easily speed up model performance. It features overall model optimization capabilities, including simplifying algebraic expressions, optimizing in-memory data layout, and improving scheduling to reduce peak memory usage and communication overhead. Advanced operator fusion and kernel generation help improve device utilization and reduce memory bandwidth requirements.
Easily scale workloads
Developing efficient parallelization algorithms is time-consuming and requires expertise. With features like GSPMD, developers only need to annotate a subset of key tensors, which can then be used by the compiler to automatically generate parallel computations. This eliminates the significant effort required to partition and efficiently parallelize models across multiple hardware hosts and accelerators.
Portability and Option
OpenXLA provides out-of-the-box support for a variety of hardware devices, Including AMD and NVIDIA GPUs, x86 CPUs and Arm architectures, and ML accelerators such as Google TPU, AWS Trainium and Inferentia, Graphcore IPU, Cerebras Wafer-Scale Engine, and more. OpenXLA also supports TensorFlow, PyTorch, and JAX through StableHLO, a portable layer used as an input format for OpenXLA.
flexibility
OpenXLA provides users with the flexibility to manually adjust model hotspots. Extension mechanisms such as custom calls enable users to write deep learning primitives in CUDA, HIP, SYCL, Triton, and other kernel languages to take full advantage of hardware features.
StableHLO
StableHLO is a portability layer between ML frameworks and ML compilers and is a support A set of high-level operations (HLO) operations for dynamics, quantization, and sparsity. Additionally, it can be serialized to MLIR bytecode to provide compatibility guarantees. All major ML frameworks (JAX, PyTorch, TensorFlow) can produce StableHLO. In 2023, Google plans to work closely with the PyTorch team to achieve integration with PyTorch version 2.0.
The above is the detailed content of Unified AI development: Google OpenXLA is open source and integrates all frameworks and AI chips. For more information, please follow other related articles on the PHP Chinese website!

Since 2008, I've championed the shared-ride van—initially dubbed the "robotjitney," later the "vansit"—as the future of urban transportation. I foresee these vehicles as the 21st century's next-generation transit solution, surpas

Revolutionizing the Checkout Experience Sam's Club's innovative "Just Go" system builds on its existing AI-powered "Scan & Go" technology, allowing members to scan purchases via the Sam's Club app during their shopping trip.

Nvidia's Enhanced Predictability and New Product Lineup at GTC 2025 Nvidia, a key player in AI infrastructure, is focusing on increased predictability for its clients. This involves consistent product delivery, meeting performance expectations, and

Google's Gemma 2: A Powerful, Efficient Language Model Google's Gemma family of language models, celebrated for efficiency and performance, has expanded with the arrival of Gemma 2. This latest release comprises two models: a 27-billion parameter ver

This Leading with Data episode features Dr. Kirk Borne, a leading data scientist, astrophysicist, and TEDx speaker. A renowned expert in big data, AI, and machine learning, Dr. Borne offers invaluable insights into the current state and future traje

There were some very insightful perspectives in this speech—background information about engineering that showed us why artificial intelligence is so good at supporting people’s physical exercise. I will outline a core idea from each contributor’s perspective to demonstrate three design aspects that are an important part of our exploration of the application of artificial intelligence in sports. Edge devices and raw personal data This idea about artificial intelligence actually contains two components—one related to where we place large language models and the other is related to the differences between our human language and the language that our vital signs “express” when measured in real time. Alexander Amini knows a lot about running and tennis, but he still

Caterpillar's Chief Information Officer and Senior Vice President of IT, Jamie Engstrom, leads a global team of over 2,200 IT professionals across 28 countries. With 26 years at Caterpillar, including four and a half years in her current role, Engst

Google Photos' New Ultra HDR Tool: A Quick Guide Enhance your photos with Google Photos' new Ultra HDR tool, transforming standard images into vibrant, high-dynamic-range masterpieces. Ideal for social media, this tool boosts the impact of any photo,


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

VSCode Windows 64-bit Download
A free and powerful IDE editor launched by Microsoft

DVWA
Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

PhpStorm Mac version
The latest (2018.2.1) professional PHP integrated development tool

SublimeText3 English version
Recommended: Win version, supports code prompts!

Atom editor mac version download
The most popular open source editor