Ten elements of machine learning system architecture

WBOY
2023-04-13 23:37:01

This is an era of AI empowerment, and machine learning is an important technical means to realize AI. So, is there a universal machine learning system architecture?

As veteran programmers know, nothing is truly universal, least of all a system architecture. It is nonetheless possible to build a scalable and reliable machine learning system architecture that applies to most machine learning driven systems or use cases. From the perspective of the machine learning life cycle, this so-called universal architecture covers the key stages, from developing machine learning models to deploying training systems and serving systems to production environments. We can describe such an architecture in terms of 10 elements.

Ten elements of machine learning system architecture

1. Data and feature engineering pipeline

This element delivers high-quality data on time and generates useful machine learning features in a scalable and flexible manner. In general, the data pipeline can be separated from the feature engineering pipeline. The data pipeline is the extract, transform, and load (ETL) pipeline, in which data engineers move data to a storage location such as a data lake built on object storage. The feature engineering pipeline focuses on converting raw data into features that help machine learning algorithms learn faster and more accurately.

Feature engineering generally happens in two stages. In the first stage, data scientists create the feature engineering logic during development, running various experiments to find the best set of features. In the second stage, data engineers or machine learning engineers productionize the feature engineering pipeline so that it supplies high-quality feature data for model training and for serving in the production environment.
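
The separation between the data pipeline and the feature engineering pipeline can be sketched as two small functions. This is only an illustration under assumed names: `RawEvent`, `extract_transform`, `build_features`, and the `user_id`/`amount` fields are hypothetical and do not come from any specific tool.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class RawEvent:
    user_id: str
    amount: float


def extract_transform(rows):
    """Data pipeline step: parse raw rows and drop records that fail basic checks."""
    events = []
    for row in rows:
        try:
            events.append(RawEvent(str(row["user_id"]), float(row["amount"])))
        except (KeyError, TypeError, ValueError):
            continue  # skip malformed records instead of failing the whole batch
    return events


def build_features(events):
    """Feature pipeline step: aggregate cleaned events into per-user features."""
    by_user = {}
    for event in events:
        by_user.setdefault(event.user_id, []).append(event.amount)
    return {
        uid: {
            "txn_count": len(amounts),
            "mean_amount": mean(amounts),
            "std_amount": pstdev(amounts),
        }
        for uid, amounts in by_user.items()
    }


# Hypothetical raw input; one record is deliberately malformed.
raw = [
    {"user_id": "u1", "amount": "10"},
    {"user_id": "u1", "amount": "30"},
    {"user_id": "u2", "amount": "oops"},
    {"user_id": "u2", "amount": 5},
]
features = build_features(extract_transform(raw))
```

Keeping the two steps separate means the cleaned data can be reused by other consumers, while the feature logic can evolve through experiments without touching ingestion.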

2. Feature storage

Feature storage holds machine learning feature data under version management so that features can be discovered, shared, and reused. It provides consistent data and features for both model training and serving, which improves the reliability of the machine learning system.

Feature storage, often called a feature store, is the persistent storage solution for the feature data that the feature engineering pipeline produces. Because it supports both model training and serving, it is an important component of the end-to-end machine learning system architecture.
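
As a rough illustration of versioned storage that serves both training and serving, here is a minimal in-memory sketch. The `FeatureStore` class and its `write`/`read` methods are hypothetical and are not the API of any real feature store product; a production system would use durable storage.

```python
class FeatureStore:
    """In-memory stand-in for a versioned feature store (hypothetical API)."""

    def __init__(self):
        self._versions = {}  # feature set name -> list of {entity_id: features}

    def write(self, name, rows):
        """Persist a new version of a feature set and return its version number."""
        self._versions.setdefault(name, []).append(dict(rows))
        return len(self._versions[name])

    def read(self, name, entity_id, version=None):
        """Read one entity's features; default to the latest version."""
        versions = self._versions[name]
        snapshot = versions[-1] if version is None else versions[version - 1]
        return snapshot[entity_id]


store = FeatureStore()
v1 = store.write("user_features", {"u1": {"txn_count": 2}})
v2 = store.write("user_features", {"u1": {"txn_count": 3}})

# Training pins an explicit version for reproducibility; serving reads the latest.
training_row = store.read("user_features", "u1", version=v1)
serving_row = store.read("user_features", "u1")
```

Because both paths read through the same interface, training and serving see features computed by the same logic, which is the consistency property described above.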

3. Machine learning model training and retraining pipeline

The training pipeline runs machine learning training with different parameters and hyperparameters, makes experiments simple and configurable, and records the parameters and model performance metrics of each training run. It automatically evaluates, validates, and selects the best-performing models and records them in a machine learning model registry.
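
The loop of running configurable experiments, recording each run, and selecting the best model can be sketched as follows. The model here is a deliberately tiny one-parameter ridge regression chosen only to keep the example self-contained; the data, the `l2` grid, and the `model_registry` dictionary are illustrative assumptions.

```python
def train(train_data, l2):
    """Fit y = w * x with L2 regularisation using the closed-form solution."""
    sxy = sum(x * y for x, y in train_data)
    sxx = sum(x * x for x, _ in train_data)
    return sxy / (sxx + l2)


def mse(w, data):
    """Mean squared error of the model y = w * x on the given data."""
    return sum((y - w * x) ** 2 for x, y in data) / len(data)


train_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
valid_data = [(4.0, 8.0), (5.0, 10.0)]

# Run one experiment per hyperparameter value and record parameters and metrics.
runs = []
for l2 in (0.0, 1.0, 10.0):
    w = train(train_data, l2)
    runs.append({"params": {"l2": l2}, "model": w, "valid_mse": mse(w, valid_data)})

# Select the best run on the validation metric and record it.
best = min(runs, key=lambda run: run["valid_mse"])
model_registry = {"best": best}
```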

4. Metadata store for training and models

The metadata store records machine learning runs, including parameters, metrics, code, configurations, results, and trained models, and provides functions such as model life cycle management, model annotation, model discovery, and model reuse.

A complete machine learning system, spanning feature engineering, model training, and model serving, generates a large amount of metadata. This metadata is very useful for understanding how the system works: it provides traceability from Data -> Features -> Model -> Server and supplies useful information for debugging when the model stops working.
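
The traceability chain can be illustrated with a small lineage log in which every artifact records its parents. The `MetadataStore` class, the artifact kinds, and the payload fields (including the path `example/raw.csv`) are hypothetical placeholders for this sketch.

```python
import hashlib
import json


def fingerprint(obj):
    """Stable short hash of a JSON-serialisable artifact description."""
    blob = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:12]


class MetadataStore:
    """Minimal lineage log: each artifact records the artifacts it was built from."""

    def __init__(self):
        self.records = {}  # artifact id -> record

    def log(self, kind, payload, parents=()):
        art_id = f"{kind}:{fingerprint(payload)}"
        self.records[art_id] = {"kind": kind, "payload": payload, "parents": list(parents)}
        return art_id

    def lineage(self, art_id):
        """Walk back through the first parent of each artifact to the original data."""
        chain = [art_id]
        while self.records[chain[-1]]["parents"]:
            chain.append(self.records[chain[-1]]["parents"][0])
        return chain


meta = MetadataStore()
data_id = meta.log("data", {"path": "example/raw.csv"})
feat_id = meta.log("features", {"logic": "per_user_aggregates"}, [data_id])
model_id = meta.log("model", {"params": {"l2": 0.0}}, [feat_id])
serve_id = meta.log("serving", {"endpoint": "scoring"}, [model_id])

trace = [art.split(":")[0] for art in meta.lineage(serve_id)]
```

When a served model misbehaves, walking this chain backwards shows which model, feature logic, and data snapshot were involved.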

5. Machine learning model serving pipeline

The serving pipeline provides appropriate infrastructure for using machine learning models in production environments, taking into account both completeness of service and latency.

Generally speaking, there are three serving modes: batch, streaming, and online. Each mode requires completely different infrastructure. In addition, the infrastructure should be fault-tolerant and should scale automatically in response to fluctuations in requests and throughput, especially for business-critical machine learning systems.
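
The three modes differ mainly in how inputs arrive and how results are returned, not in the model itself. The sketch below uses a placeholder linear scoring function; `predict`, `serve_batch`, `serve_stream`, and `serve_online` are hypothetical names, and real deployments would sit behind a job scheduler, a message queue, and a request handler respectively.

```python
from typing import Iterable, Iterator


def predict(features: dict) -> float:
    """Placeholder model: a fixed linear score (illustrative only)."""
    return 0.5 * features["txn_count"] + 0.25 * features["mean_amount"]


def serve_batch(rows: list) -> list:
    """Batch mode: score a whole stored dataset in one scheduled job."""
    return [predict(row) for row in rows]


def serve_stream(events: Iterable[dict]) -> Iterator[float]:
    """Streaming mode: score events lazily, one at a time, as they arrive."""
    for event in events:
        yield predict(event)


def serve_online(request: dict) -> dict:
    """Online mode: answer a single request synchronously."""
    return {"score": predict(request)}


rows = [
    {"txn_count": 2, "mean_amount": 20.0},
    {"txn_count": 4, "mean_amount": 8.0},
]
batch_scores = serve_batch(rows)
stream_scores = list(serve_stream(iter(rows)))
online_response = serve_online(rows[0])
```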

6. Monitor ML models in production

In the production environment, this element provides data collection, monitoring, analysis, and visualization, sends notifications when data drift, model drift, or anomalies are detected, and supplies the information needed to debug the system.
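
One common way to quantify data drift is the Population Stability Index (PSI), which compares the binned distribution of a feature in production against a reference sample. The article does not prescribe a specific metric, so this is just one possible choice; the bin count, the `eps` floor for empty bins, and the alert threshold of 0.25 are assumptions for illustration.

```python
import math


def psi(expected, actual, bins=5, eps=1e-6):
    """Population Stability Index between a reference sample and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


reference = [i / 100 for i in range(100)]        # sample seen at training time
shifted = [0.5 + i / 200 for i in range(100)]    # live sample concentrated higher

no_drift = psi(reference, reference)
drift = psi(reference, shifted)
alert = drift > 0.25  # assumed alert threshold; tune per feature
```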

7. Machine Learning Pipeline

Compared with ad hoc, one-off machine learning workflows, machine learning pipelines provide a reusable framework that lets data scientists develop and iterate faster while maintaining high-quality code and shortening the time to production. Some machine learning pipeline frameworks also provide orchestration and architectural abstraction capabilities.

8. Workflow orchestration

Workflow orchestration is the key component that integrates the end-to-end machine learning system by coordinating and managing the dependencies among all of its key components. Workflow orchestration tools also provide features such as logging, caching, debugging, and retrying.
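
Dependency management, logging, and retrying can be sketched with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The `run_workflow` function, the task names, and the retry policy are hypothetical; real orchestration tools add scheduling, caching, and distributed execution on top of this idea.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, deps, max_retries=2):
    """Run tasks in dependency order, retrying a failed task up to max_retries times."""
    log = []
    for name in TopologicalSorter(deps).static_order():
        for attempt in range(1, max_retries + 2):
            try:
                tasks[name]()
                log.append((name, attempt, "ok"))
                break
            except Exception as exc:
                log.append((name, attempt, f"failed: {exc}"))
        else:
            # The inner loop never hit "break": every attempt failed.
            raise RuntimeError(f"task {name!r} failed after {max_retries + 1} attempts")
    return log


calls = {"train": 0}


def flaky_train():
    """Simulated training task that fails once with a transient error."""
    calls["train"] += 1
    if calls["train"] == 1:
        raise RuntimeError("transient error")


tasks = {
    "ingest": lambda: None,
    "features": lambda: None,
    "train": flaky_train,
    "deploy": lambda: None,
}
# Each key maps a task to the set of tasks that must finish before it.
deps = {"features": {"ingest"}, "train": {"features"}, "deploy": {"train"}}

log = run_workflow(tasks, deps)
```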

9. Continuous Integration/Continuous Training/Continuous Delivery (CI/CT/CD)

Continuous integration, continuous training, and continuous delivery refer to continuously training new models with new data, upgrading model performance when needed, and continuously delivering and deploying models to production environments in a secure, agile, and automated manner.
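
A key piece of automating training and delivery is a promotion gate that decides whether a freshly trained model replaces the one in production. The sketch below assumes a single validation metric named `valid_accuracy` and a minimum required gain; both are illustrative choices, not something the article specifies.

```python
def should_promote(candidate, production, min_gain=0.01):
    """Promote only if the candidate beats production by at least min_gain."""
    return candidate["valid_accuracy"] >= production["valid_accuracy"] + min_gain


def ct_cd_step(production, candidate):
    """One automated cycle: keep or replace the production model."""
    if should_promote(candidate, production):
        return candidate, "deployed"
    return production, "kept"


production = {"version": 1, "valid_accuracy": 0.90}

# A retrained model with a clear improvement is deployed.
current, action_1 = ct_cd_step(production, {"version": 2, "valid_accuracy": 0.93})

# A later candidate that does not improve enough is rejected.
current, action_2 = ct_cd_step(current, {"version": 3, "valid_accuracy": 0.932})
```

Evaluating both models on the same validation data is essential; otherwise the comparison in the gate is not meaningful.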

10. End-to-end quality control for data and models

Reliable data quality checks, model quality checks, and data and concept drift detection need to be embedded in each stage of the end-to-end machine learning workflow to ensure that the machine learning system itself is reliable and trustworthy. These quality control checks include descriptive statistics, overall data shape, missing data, duplicate data, nearly constant features, statistical tests, distance metrics, and model prediction quality, among others.
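
A few of the checks listed above (data shape, missing data, duplicate data, nearly constant features) can be sketched in plain Python. The `quality_report` function, the 0.95 threshold, and the sample rows are assumptions made for this example; it also assumes all values are hashable.

```python
from collections import Counter


def quality_report(rows, constant_threshold=0.95):
    """Basic data quality checks over a list of dict records."""
    if not rows:
        raise ValueError("quality_report needs at least one row")
    columns = sorted({key for row in rows for key in row})
    n = len(rows)
    report = {"row_count": n, "missing_share": {}, "nearly_constant": []}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        report["missing_share"][col] = 1 - len(present) / n
        if present:
            # Share of the single most frequent value among non-missing values.
            top_count = Counter(present).most_common(1)[0][1]
            if top_count / len(present) >= constant_threshold:
                report["nearly_constant"].append(col)
    # Count rows that exactly repeat an earlier row.
    seen = Counter(tuple(sorted(row.items())) for row in rows)
    report["duplicate_rows"] = sum(count - 1 for count in seen.values())
    return report


rows = [
    {"age": 30, "country": "NL"},
    {"age": None, "country": "NL"},
    {"age": 30, "country": "NL"},
    {"age": 41, "country": "NL"},
]
report = quality_report(rows)
```

A pipeline stage could fail fast when, for example, `missing_share` exceeds an agreed limit, rather than passing poor data on to training.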

The above can be called the 10 elements of machine learning system architecture. In our practice, the overall workflow should remain roughly the same, but some elements of it may need to be tweaked and customized.

How to adjust the system architecture of machine learning?

How to streamline architectural elements at the beginning of product design?

How to maintain the continuity of the original system architecture when introducing a machine learning system?

Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn to have it deleted.