An SRE who cannot build a data asset system is not a good maintenance person.

WBOY
2023-07-22 15:33:51

1. Understanding data assets

1. Data assets - enterprise IT value

[Figure: data before and after assetization]

As shown in the figure, before data assetization is implemented, data tends to be scattered, data production and consumption are not unified, and data islands or zero returns easily occur.

After data assetization, we integrate data from different channels and build a unified data source, or a pipeline for data collection, storage, and analysis, and then unify the corresponding data structures, data relationships, and consumption outlets.

Once the operations data has been collected and organized, it can serve the team's own decision-making and business processes.

2. Data assets - taking the operation and maintenance scenario as an example

[Figure: classification of data assets in the operation and maintenance scenario]

The figure above uses the operation and maintenance scenario as an example to introduce the classification of data assets. To understand data assets, you need to understand their three elements, namely data type, data form, and data carrier, and how they correspond to one another.

  • Data type: information description of operation and maintenance characteristics

At the business indicator level, SRE focuses on information such as transaction time and order volume; at the running software level, on information such as user IP and interface call status; at the infrastructure level, on the network packet loss rate, memory usage, or CPU usage; going deeper, SRE also pays attention to data such as change events and the number of pilot releases or emergency changes.

  • Data form: the form in which data is stored in the data carrier, for example logs, relational data, or monitoring data.

  • Data carrier: provides the storage for operation and maintenance data. We choose the carrier according to the form of the data, for example a relational database, a persistent database, a message queue, or log files.
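
The correspondence between the three elements can be sketched as a small catalog. This is only an illustration: the `DataAsset` class, the entry names, and the specific type/form/carrier pairings are assumptions made for the example, not a classification taken from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    name: str
    data_type: str  # what the data describes (business, software, infrastructure, change)
    data_form: str  # how the data is expressed (log, relational, monitoring, event)
    carrier: str    # where the data is stored


# Hypothetical catalog entries; the pairings are illustrative assumptions.
CATALOG = [
    DataAsset("order_volume", "business", "monitoring", "persistent database"),
    DataAsset("interface_calls", "software", "log", "log file"),
    DataAsset("cpu_usage", "infrastructure", "monitoring", "persistent database"),
    DataAsset("change_events", "change", "event", "message queue"),
    DataAsset("host_inventory", "infrastructure", "relational", "relational database"),
]


def carriers_by_form(assets):
    """Group carriers by data form to see which storage backs each form."""
    result = {}
    for asset in assets:
        result.setdefault(asset.data_form, set()).add(asset.carrier)
    return result


if __name__ == "__main__":
    for form, carriers in sorted(carriers_by_form(CATALOG).items()):
        print(form, "->", sorted(carriers))
```

Keeping the three elements as separate fields makes it easy to ask questions such as "which carriers hold monitoring data" without touching the data itself.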

3. Data assets - enhance the value of SRE

[Figure: how data assets enhance the value of SRE]

Based on the operation and maintenance data obtained, first build asset management platforms, such as the CMDB discussed later. Use these platforms to break down and manage the large volume of operation and maintenance data according to consumption scenarios, thereby turning it into assets.

In addition, the data asset platform lets us quickly establish and improve platforms related to SRE stability, such as SLO and capacity management platforms. Once these platforms are in place, we continue to explore the potential value of the data and improve the stability that SRE cares about.

2. Data governance - methodology

1. Problems faced by operation and maintenance data standards

[Figure: problems faced by operation and maintenance data standards]

The problems faced by operation and maintenance data standardization are similar to the data quality problems in big data scenarios. They mainly include data islands, low data quality, opaque data, insufficient data services, and long development times to obtain data.

These problems make it difficult to iterate quickly on data consumption scenarios, so business needs go unmet. When human, server, middleware, and other resources are insufficient, the impact on data standardization work is catastrophic.

Operation and maintenance data is inherently non-standard. For example, logs and monitoring data are stored in different ways. We need to refine the data as far as possible and complete standardization under limited resources.

For concepts that have recently become popular in the industry, such as DataOps and AIOps, we still lack a mature and comprehensive data modeling methodology.

2. Establish an operation and maintenance data governance model

Turning operation and maintenance data into data assets requires focusing on three parts: governance methods, governance processes, and technology platforms.

[Figure: operation and maintenance data governance model]

1) Governance method

  • Master data management: Define and split the data that SRE focuses on. For example, data such as hosts and CLP can be treated as master data and managed across its life cycle.
  • Generalized metadata management: This data enters the CMDB through a closed-loop reporting process. With the CMDB model as its representative, it provides data support to the upper layers.
  • Key governance links: Sort out the entire governance chain along three dimensions, namely data standards, governance quality, and security baseline. This yields the data standards, quality goals, and baseline requirements for each change.

2) Governance process

The governance process covers strategy, construction, and operation. Construction in particular requires building platforms and tools that support the team's own operations.

3) Technology platform

The main purpose of establishing a technology platform is to support existing and incremental data through tools.

3. Focus on key elements of data governance

The key elements of data governance mainly focus on four aspects: organizational guarantee, system construction, project implementation and platform support.

  • Organizational guarantee: To solve the staffing problem, we clarify each member's role and division of responsibilities. The dedicated data governance team consists of three roles: product, operations, and R&D.
  • System construction: Build standardized processes and ensure they are followed in an orderly way, for example specifications for resource access, resource development, and the resource data model.
  • Project implementation: Run the work as a dedicated governance project. Data governance is a long-term process, not a one-off campaign. If data quality falls seriously below standard, we set up a special team and use a campaign-style push to repair the data quality problems urgently. Long-term governance, however, requires deriving governance methodologies from the data products and building them into a productized platform that drives the parties responsible for the data to govern it.
  • Platform support: Platform construction mainly focuses on dimensions such as fine-grained measurement, execution, and governance efficiency.

3. CMDB platform construction

1. CMDB configuration management database

[Figure: CMDB configuration management database]

The CMDB (configuration management database) focuses on four areas of construction: a basic registered technical ledger, detailed natural attributes, natural relationships, and a resource consumption map. We need to build models in layers that correspond to the business, and then push configuration changes in real time through automated sensing or standardized processes.

The configuration also needs a visual interface to encourage collaboration. Ultimately, this data drives data consumption scenarios through apps or corresponding offline scenarios.

2. The positioning of CMDB in the ITIL era - metadata center

In my understanding, the CMDB is a metadata center. As shown in the figure above, the CMDB cleans or assembles data related to organizations, personnel, decisions, permissions, processes, and so on.

There are many lower-layer platforms to integrate with, such as monitoring platforms, email, SMS, and operation and maintenance databases. After the data is assembled, it is handed to the upper layer (a platform similar to a service management layer), which uses it for data output, for services such as asset management and configuration management, and for platform construction.

3. The positioning of CMDB in the new era - application-centered

[Figure: application-centered CMDB]

Being application-centered makes it possible to associate organization, project, and personnel relationships and bind them to the application.

While an application is running, it uses resources (server resources, the configuration center, observability indicators, etc.), and ownership relationships are formed according to the company's organizational structure. Finally, the organizational-structure perspective is mapped to the microservice perspective to form resources and their relationships, that is, topology, including application topology and physical topology.
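
To make the application-centered view concrete, here is a minimal sketch in which each application carries its organization, project, and owners, and both topologies are derived from the resources it uses. The `Application` class, the resource kinds, and all sample names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Application:
    app_id: str
    organization: str
    project: str
    owners: list
    # resource kind -> names of the resources the application uses
    resources: dict = field(default_factory=dict)


def application_topology(apps):
    """Edges (app_id, resource kind, resource name) seen from the application."""
    edges = []
    for app in apps:
        for kind, names in sorted(app.resources.items()):
            for name in names:
                edges.append((app.app_id, kind, name))
    return edges


def physical_topology(apps):
    """Inverted view: each server mapped to the applications running on it."""
    hosts = {}
    for app in apps:
        for host in app.resources.get("server", []):
            hosts.setdefault(host, []).append(app.app_id)
    return {host: sorted(ids) for host, ids in hosts.items()}


# Hypothetical sample data.
APPS = [
    Application("pay-gateway", "payments", "checkout", ["alice"],
                {"server": ["host-01", "host-02"], "config": ["pay-gateway.yaml"]}),
    Application("order-api", "trade", "checkout", ["bob"],
                {"server": ["host-02"]}),
]

if __name__ == "__main__":
    print(application_topology(APPS))
    print(physical_topology(APPS))
```

Because every relationship hangs off the application, the physical view does not need to be maintained separately; it is computed by inverting the application view.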

4. Advantages of application-centric CMDB

[Figure: advantages of an application-centric CMDB]

5. The relationship between the application and the metadata center during runtime

[Figure: relationship between the application and the metadata center at runtime]

As the figure above shows, the CMDB provides infrastructure metadata, PaaS-related data, and runtime data to the upper layer (CI platform, CD platform, and service operation platform). The lower-layer platforms shown in the figure form a service resource support platform.

The advantage of this design is that it provides basic data support for the entire life cycle of an application, including application creation, application runtime (build, release, scaling, billing), and resource reclamation after the application goes offline.

6. The four major stages of CMDB construction

[Figure: the four major stages of CMDB construction]

The figure above shows the four major stages of CMDB construction. We are currently in the fourth stage, moving from service orientation to value orientation.

Department orientation:

  • Whether or not a CMDB system exists, the need for a CMDB does, and configuration information is maintained department by department;
  • The information is isolated and not timely, and its completeness and correctness cannot be guaranteed.

Data orientation:

  • The data and interrelationships that all departments care about are brought under unified CMDB management, and a configuration management process system is established;
  • Because consumption scenarios are unclear, consumption value and production cost are out of balance;
  • At Bilibili, the cost of producing data is not very high, but many data consumption products have to be built, or the business side frequently raises customized scenario requirements that the CMDB team has to develop for. This exposed a problem: the CMDB has more than 300 OKACIs, which is inconvenient to maintain.

Scenario orientation:

  • Local data is highly standardized and accurate;
  • Because the usage scenario is narrow, the overall consumption value is not high and the production cost is relatively high.

Service orientation:

  • Data supply services support daily operation management and control, such as automation, monitoring, workflow management, operation and maintenance analysis, etc.;
  • Introduce diversified data production/consumption methods to gradually balance consumption value and production costs.

Value orientation:

  • CMDB fully supports services and business development, such as service capacity management and availability management, becoming the cornerstone of IT operation and maintenance;
  • Proactively promote the improvement of the organization's IT management level.

7. How to build a CMDB model

[Figure: how to build a CMDB model]

  • Define data types: These include hosts, switches, applications, and application configuration files. Configuration staff investigate them after receiving a requirement.
  • Define core data attributes: Taking the host as an example, you need to report or collect core resource attributes such as IP, serial number, computer room, and cloud vendor.
  • Build direct relationships in the data model: Sort out how resources correspond to each other, such as inclusion, dependency, and running relationships, so that the resource topology can be produced later. For example, if applications are one data type and hosts are another, an application depends on hosts when it runs, and hosts in turn make up the application.
  • Confirm consumption scenarios: This means confirming at which stages the data is used. If it is used for cluster deployment, you may need to perform the deployment, or the corresponding operation and maintenance tasks, in the application dimension.
  • Establish data specifications: What is the life cycle (from creation through production to deployment)? How does the platform detect changes in data status?

To sum up, we should take the entire life cycle of the data as the starting point, determine attributes, clarify relationships, clarify consumption scenarios, and use automated processes to keep the data timely and accurate.
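
The first three steps above (data types, core attributes, relationships) can be sketched as a tiny model definition. The host attributes follow the example in the text; the application attributes, the `runs_on` relation name, and the helper functions are assumptions introduced for illustration and do not describe any particular CMDB product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CIType:
    name: str
    core_attributes: tuple


@dataclass(frozen=True)
class RelationType:
    source: str
    relation: str
    target: str


# Host attributes come from the text; application attributes are assumed.
HOST = CIType("host", ("ip", "serial_number", "computer_room", "cloud_vendor"))
APPLICATION = CIType("application", ("name", "owner"))

MODEL = {ci_type.name: ci_type for ci_type in (HOST, APPLICATION)}

# Hypothetical relation: an application runs on (depends on) hosts.
RELATIONS = {RelationType("application", "runs_on", "host")}


def missing_attributes(ci_type, record):
    """Core attributes that are absent or empty in a reported record."""
    return [a for a in MODEL[ci_type].core_attributes if not record.get(a)]


def relation_allowed(source, relation, target):
    """True when the model defines this relationship between two CI types."""
    return RelationType(source, relation, target) in RELATIONS


if __name__ == "__main__":
    reported = {"ip": "10.0.0.1", "serial_number": "SN-001"}
    print(missing_attributes("host", reported))
    print(relation_allowed("application", "runs_on", "host"))
```

Declaring relationships as data rather than code means a new CI type or relation can be added without changing the validation functions.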

1) Model relationship definition

[Figure: model relationship definition]

2) CI relationship DEMO example

[Figure: CI relationship demo]

3) CMDB implementation framework

  • Current situation assessment: Is there a CMDB platform today? How mature is it? What is the quality of its data? What are the organizational structure and the technical architecture? What is the status of the resources that will be needed during the rollout?
  • Project startup: At startup, define the CI models and relationships of the resources to be onboarded, the later consumption scenarios, the data sources, and the CI stakeholders.
  • Data instantiation: To verify data instantiation, build a test environment and import the CI models or instantiated data.
  • Data verification: In the UG environment, compare the reported data with the actual output to confirm whether data quality meets the standard. Once it does, build a production environment and check the state of the data there.
  • Data scenario consumption: After the data lands in the production environment, verify the data consumption scenarios by connecting to the operation platform or SRE platform.
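
The data verification step can be illustrated with a simple accuracy measure that compares what the CMDB reports with what is actually observed. This is a rough sketch under assumptions: the function names, the dictionary layout, and the 0.99 default threshold are all hypothetical, and a real check would likely be more nuanced.

```python
def attribute_accuracy(reported, actual, attributes):
    """Share of (instance, attribute) pairs where the CMDB matches reality.

    Both `reported` and `actual` map a CI identifier to a dict of attributes.
    Instances missing from `reported` count as mismatches for every attribute
    that has an actual value.
    """
    total = 0
    matched = 0
    for ci_id, truth in actual.items():
        record = reported.get(ci_id, {})
        for attr in attributes:
            total += 1
            if record.get(attr) == truth.get(attr):
                matched += 1
    return matched / total if total else 1.0


def meets_standard(reported, actual, attributes, threshold=0.99):
    """Decide whether data quality is good enough to move to production.

    The default threshold is an assumed value, not one given in the article.
    """
    return attribute_accuracy(reported, actual, attributes) >= threshold


if __name__ == "__main__":
    actual = {
        "host-01": {"ip": "10.0.0.1", "computer_room": "A"},
        "host-02": {"ip": "10.0.0.2", "computer_room": "B"},
    }
    reported = {
        "host-01": {"ip": "10.0.0.1", "computer_room": "A"},
        "host-02": {"ip": "10.0.0.2", "computer_room": "A"},
    }
    print(attribute_accuracy(reported, actual, ["ip", "computer_room"]))
```

Measuring per attribute rather than per record makes it easier to see which fields drag quality down before promoting the data to production.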

4) Standardization first

Standardization first means that all matters before implementation are built around standardization. These include some strong requirements, such as planning requirements, process requirements, organizational requirements and platform requirements.

Planning requirements:

  • Clearly define the role of the CMDB platform and the relationship between other business systems;
  • Clearly define the resource management process and the responsible persons and platforms;
  • Clearly define the baseline standards of resources and deviation management methods;
  • Plan and build configuration management capabilities from the perspective of service business scenarios.

Process requirements:

  • Can truly reflect the status of resources;
  • Can completely include all resource information and relationships between resources;
  • Serves as the single, globally authoritative data source;
  • The data can be obtained conveniently, promptly, and efficiently by users and systems.

Organizational requirements:

  • Establish a single body responsible for building configuration management capability;
  • Each business team has clear responsibilities for configuration consumption and improvement;
  • Form a mechanism for configuration management discussion, optimization and requirements collection.

Platform requirements:

  • Gradually realize automatic configuration discovery and automatic maintenance;
  • Real-time tracking of resource status and configuration changes;
  • The model is flexible and can be expanded and adjusted in real time according to business needs;
  • Configuration visualization can support the analysis and rapid location of resource problems.

5) Create a closed-loop data life cycle

First, determine the application attributes. These may include the application's Chinese and English names, application level, unique ID, owning business, and business domain. Which attributes to include is mainly up to your own definition. Once the application is defined, it may have relationships with other CIs, which need to be sorted out further.

Second, clarify who is responsible for the application's attributes. Each application has corresponding responsible persons, such as R&D and SRE. We have processes around these users for application build, release, change, and other actions to ensure that application configuration and changes are reviewed.

Finally, perform scheduled collection tasks to ensure the final data accuracy of the application.
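
The second step, tying every change to a responsible person, might look like the following sketch. The `AppRecord` class, the attribute names, and the rule that only listed responsible persons may approve a change are assumptions made for the example; the scheduled collection step is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class AppRecord:
    attributes: dict
    responsible: set  # people allowed to review changes, e.g. R&D and SRE
    history: list = field(default_factory=list)


def apply_change(record, attr, value, reviewer):
    """Apply an attribute change only when a responsible person reviewed it."""
    if reviewer not in record.responsible:
        raise PermissionError(f"{reviewer} is not responsible for this application")
    old = record.attributes.get(attr)
    record.attributes[attr] = value
    # Keep an audit trail so later collection runs can explain differences.
    record.history.append((attr, old, value, reviewer))
    return record


if __name__ == "__main__":
    app = AppRecord(
        attributes={"app_id": "order-api", "level": "L2", "business": "trade"},
        responsible={"bob", "carol"},
    )
    apply_change(app, "level", "L1", reviewer="bob")
    print(app.attributes["level"], app.history)
```

Recording who approved each change gives the data a clear owner, which is what keeps the life cycle closed rather than relying on ad hoc edits.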

6) Promote automatic discovery and update of configuration

The "resources" mentioned in the above figure are still resources in the traditional sense, such as server resources. These resources are collected through a certain method and finally reported to the resource management platform.

  • Build a complete configuration collection capability to eliminate manual maintenance scenarios;
  • Automatically discover resource and application configuration information;
  • Integrate with processes, management platforms, and devices to obtain and update configuration status in real time;
  • Establish resource configuration and usage specifications, and conduct compliance checks through the CMDB;
  • Promote a closed loop of configuration consumption, and automatically maintain data reliability through consumption feedback.
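
Automatic discovery and update can be sketched as a reconciliation step: compare what the CMDB stores with what a collector has just discovered and derive the actions needed. The `reconcile` function and the action names `create`, `update`, and `retire` are hypothetical; how discovery itself is performed is outside this sketch.

```python
def reconcile(stored, discovered):
    """Compare CMDB records with freshly discovered configuration.

    Both arguments map a CI identifier to a dict of attributes. The result is
    a list of (action, ci_id, attributes) tuples that would bring the CMDB in
    line with what was discovered.
    """
    actions = []
    for ci_id, attrs in sorted(discovered.items()):
        if ci_id not in stored:
            actions.append(("create", ci_id, attrs))
            continue
        changed = {k: v for k, v in attrs.items() if stored[ci_id].get(k) != v}
        if changed:
            actions.append(("update", ci_id, changed))
    # Anything stored but no longer discovered is a candidate for retirement.
    for ci_id in sorted(stored.keys() - discovered.keys()):
        actions.append(("retire", ci_id, {}))
    return actions


if __name__ == "__main__":
    stored = {"host-01": {"ip": "10.0.0.1"}, "host-03": {"ip": "10.0.0.3"}}
    discovered = {"host-01": {"ip": "10.0.0.9"}, "host-02": {"ip": "10.0.0.2"}}
    for action in reconcile(stored, discovered):
        print(action)
```

Returning actions instead of mutating the store directly leaves room for a review or compliance check before the CMDB is updated.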

Statement:
This article is reproduced from 51cto.com. If there is any infringement, please contact admin@php.cn for removal.