AI-driven SLS technology with innovative intelligent analysis capabilities

王林
2023-12-19 20:34:12

AIOps brings revolutionary changes to operation and maintenance work

As cloud computing matures, the IT infrastructure that carries business workloads keeps growing, the dependencies between applications become more complex, and the volume of log data increases accordingly. How an enterprise collects, stores, analyzes, and processes that log data has become an important indicator of how far its systems have been digitalized. Traditional IT operations solutions struggle under these conditions. For DevOps teams, a single problem can take hours of searching, comparing, and analyzing: engineers have to review logs, monitoring data, and other related information to find the root cause. For SecOps teams, in-depth analysis means digging root causes and anomalies out of hundreds of terabytes of data quickly. The process is time-consuming and tedious, and can demand a large investment of people and resources.

Addressing these problems calls for a new generation of AIOps solutions. Such a solution uses data fusion analysis to deliver automation and full-stack, end-to-end data observability, and provides easier-to-use reports and diagnostic rules with what-you-see-is-what-you-get results. With AI support, anomalies can be detected automatically and more efficiently, and root causes can be located quickly. AIOps is bringing revolutionary changes to operations and maintenance work.

AI-powered SLS: innovation in intelligent analysis capabilities

How does Log Service SLS improve efficiency?

SLS automates full-stack data collection

  • Cloud infrastructure observability (Alibaba Cloud Lens): unified collection of cloud product operations data across accounts and regions, with automatic collection of measurement data, metrics, access logs, and other data
  • Application observability (Full-stack observability): full-stack data collection from client to server and from infrastructure to application, with correlation analysis across multiple data sources, a complete analysis syntax, and rich context support
  • Security audit (Log Audit): automatic ingestion from 50 data sources, security posture dashboards, 100 built-in security alert rules, and multi-account management with cross-account, cross-region collection into centralized storage

SLS provides out-of-the-box reports and diagnostic rules

  • CloudLens built-in rules: comprehensive analysis to assist cloud product operations, with flexible data subscription through consumer groups, the API, Grafana, and similar platforms
  • Full-stack observability built-in alerts: real-time alerts, an event management system, alert convergence, customizable dashboards, and built-in anomaly detection and root cause analysis
  • Security built-in rules: nearly a hundred built-in security compliance monitoring rules, covering standards such as MLA, the Cybersecurity Law, and GDPR

SLS launches an open and compatible data ecosystem

  • SLS is compatible with multiple data sources and collects them in a unified way.
  • SLS is a cost-effective observability storage and analysis platform that is compatible with open source. It has built-in serverless analysis capabilities and is compatible with open-source engines and tools, including Elasticsearch, Kafka, Prometheus, and ClickHouse (CK), with seamless migration in 99% of cases.
  • SLS connects to offline data warehouses and data lakes, integrates with third-party SIEM systems to provide cloud security auditing for SecOps, and supports multiple alert notification channels.

Foundation model innovation for IT operations scenarios

Alibaba Cloud Log Service (SLS) is committed to building efficient observability and operations solutions. Drawing on years of operations experience and on large language models, SLS continues to strengthen its position in this field. SLS recently released a foundation model for intelligent operations that covers observability data such as logs, traces, and metrics, and supports functions such as anomaly detection, text segmentation and annotation, and high-latency analysis of traced requests. The model provides out-of-the-box anomaly detection, automatic annotation, classification, and root cause analysis. In production environments it can locate a root cause within seconds among thousands of requests, with accuracy above 95%.

SLS also provides human-assisted fine-tuning. The Log Service platform natively supports annotation feedback for Log, Metric, and Trace data, so customers can annotate and correct results as they work and build up datasets suited to their own scenarios. With these annotation capabilities, customers can accumulate high-quality operations data labels from scratch, which lays the groundwork for training root cause diagnosis models later. In the future, customers will be able to fine-tune domain-specific models on their own annotated data, deploy them quickly, and run them as private model services. The feature supports automatic annotation and the correction of manual annotation results, and the model is fine-tuned automatically from manual feedback to improve accuracy in the customer's scenario.

SLS also acts as an intelligent assistant by helping generate query statements. Alibaba Cloud has released the CloudLens Copilot large model to help with cloud infrastructure maintenance and operations. It uses NL2Query technology based on large language models to understand the user's query intent accurately and improve the accuracy of query results. Users do not need to know complex SQL or query syntax: natural language questions are converted into SQL queries and visual charts. A scenario-based knowledge graph, continuous learning, and ongoing model tuning and knowledge base updates keep improving the accuracy and quality of question answering.

Scenario example: intelligent anomaly detection and root cause analysis

For game service systems with complex calls and dependencies, we propose the following solution. We use the Trace data in the service to generate topology maps automatically, then analyze and diagnose high latency, high error rates, system hotspots, and bottlenecks, which shortens problem resolution time and reduces system latency.

With the automatically generated topology map, root causes of anomalies and performance bottlenecks can be located quickly in massive Trace data without manual intervention. This improves the efficiency of anomaly localization in large-scale distributed systems and achieves root cause localization within seconds across thousands of requests. In production environments, the accuracy of this solution can reach 95%.
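The idea of deriving a topology from Trace data and then looking for latency hotspots can be illustrated with a minimal Python sketch. It is not the SLS algorithm, which the article does not describe. The span fields (`span_id`, `parent_id`, `service`, `duration_ms`) and the sample service names are hypothetical, loosely modeled on parent/child spans. The sketch derives service-to-service edges from parent/child links and ranks services by self time, meaning span duration minus the time spent in child spans.

```python
from collections import defaultdict

def build_topology(spans):
    """Return the set of (caller_service, callee_service) edges seen in the spans."""
    by_id = {s["span_id"]: s for s in spans}
    edges = set()
    for s in spans:
        parent = by_id.get(s["parent_id"])
        if parent is not None and parent["service"] != s["service"]:
            edges.add((parent["service"], s["service"]))
    return edges

def self_time_by_service(spans):
    """Sum, per service, the time spent in its own spans excluding child spans."""
    child_total = defaultdict(float)
    for s in spans:
        if s["parent_id"] is not None:
            child_total[s["parent_id"]] += s["duration_ms"]
    totals = defaultdict(float)
    for s in spans:
        # Clamp at zero: parallel children can sum to more than the parent's duration.
        totals[s["service"]] += max(s["duration_ms"] - child_total[s["span_id"]], 0.0)
    return dict(totals)

def slowest_service(spans):
    """Return the service with the largest total self time."""
    totals = self_time_by_service(spans)
    return max(totals, key=totals.get)

# Hypothetical sample trace: gateway -> match -> (inventory, profile)
SPANS = [
    {"span_id": "a", "parent_id": None, "service": "gateway", "duration_ms": 500.0},
    {"span_id": "b", "parent_id": "a", "service": "match", "duration_ms": 450.0},
    {"span_id": "c", "parent_id": "b", "service": "inventory", "duration_ms": 400.0},
    {"span_id": "d", "parent_id": "b", "service": "profile", "duration_ms": 20.0},
]
```

On the sample spans, `slowest_service(SPANS)` returns `"inventory"`: the gateway and match spans are long only because they wait on it. Self time is a simple heuristic chosen here for illustration; a production system would also need to handle error rates, sampling, and asynchronous calls.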

Foundation model for intelligent operations

Traditional AIOps techniques, such as anomaly detection and root cause localization, have two main problems:

  • AIOps algorithms depend on many thresholds and rules. These settings have to be tested and chosen again for each business scenario, so the algorithms are costly to maintain and hard to adapt as business scenarios change
  • AIOps models are generally built on private-domain data, which is often small in volume and poor in quality. As a result the models generalize and transfer poorly, and often have to be rebuilt for each business scenario

To address these problems, SLS has launched a general-purpose model capability for intelligent operations. We have developed separate foundation models for logs, traces, and metrics, and provide out-of-the-box anomaly detection, root cause analysis, and automatic labeling. The models can locate root causes within seconds across thousands of requests, with accuracy above 95% in production environments. Each data type is pre-trained on different tasks:

  • Metric foundation model: supports time-series anomaly detection, time-series forecasting, shape detection, and similar tasks, enabling smarter inspections
  • Log foundation model: provides rich LogNER capabilities for log scenarios and helps extract log templates that carry semantic information
  • Trace foundation model: supports high-latency diagnosis of Trace data in the OpenTelemetry (OT) protocol
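For contrast with the Metric model, the following Python sketch shows a classical time-series anomaly detector of the threshold-dependent kind criticized above. It is a rolling z-score baseline, not the SLS model. The function name, the `window` and `threshold` parameters, and the sample latency values are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def rolling_zscore_anomalies(values, window=5, threshold=3.0, min_sigma=1e-9):
    """Return indexes whose value deviates from the preceding window
    by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        # Floor the deviation so a perfectly flat history does not divide by zero.
        sigma = max(pstdev(history), min_sigma)
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical latency samples in milliseconds, with one spike at index 6.
LATENCY_MS = [100, 102, 98, 101, 99, 100, 250, 101, 99, 100]
```

With the defaults, `rolling_zscore_anomalies(LATENCY_MS)` returns `[6]`. Both `window` and `threshold` have to be tuned per metric, which is the maintenance cost that a pre-trained model is meant to reduce.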

Products backed by domain-specific foundation models work out of the box, with no cumbersome deployment: a single click is enough to start using them, which greatly lowers the barrier to using the core capabilities of Log Service. Customers do not need to fine-tune a model for their scenario; the general foundation model provided by Log Service gives good results as is.
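To make the log template extraction mentioned for the Log model more concrete, here is a minimal Python sketch that masks variable tokens with regular expressions and counts the resulting templates. It is a simple stand-in, not LogNER: a learned model would label entities with semantic types rather than rely on fixed patterns. The placeholder names (`<IP>`, `<HEX>`, `<NUM>`) and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Order matters: more specific patterns are masked before generic numbers.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]

def to_template(line):
    """Replace variable tokens in a log line with typed placeholders."""
    for pattern, placeholder in MASKS:
        line = pattern.sub(placeholder, line)
    return line

def count_templates(lines):
    """Group log lines by template and count the occurrences of each."""
    return Counter(to_template(line) for line in lines)

# Hypothetical sample log lines.
LOGS = [
    "Connection from 10.0.0.12 closed after 35 ms",
    "Connection from 10.0.0.7 closed after 120 ms",
    "User 4821 login failed",
]
```

Here the first two lines collapse into the single template `Connection from <IP> closed after <NUM> ms`, so `count_templates(LOGS)` reports two templates for three lines.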

Alibaba Cloud Lens Copilot large model assists infrastructure maintenance and operations

Alibaba Cloud Lens Copilot uses large models to support cloud infrastructure maintenance and operations. It addresses three problems users face: unfamiliarity with SLS syntax, lack of business domain knowledge, and lack of a high-quality question-and-answer corpus

  • Accurate intent recognition: uses NL2Query technology based on large language models to understand the user's query intent and improve the accuracy of query results
  • WYSIWYG results and reports: no need to know complex SQL or query syntax; natural language questions are converted into SQL queries and visual charts
  • Automatic learning from asset data: integrates the asset data and knowledge graph in Alibaba Cloud Lens, keeps learning from asset data, and tunes the model automatically

Summary

As AI capabilities improve, SLS's intelligent analysis capabilities will improve across the board. SLS aims to use data and algorithms to support AIOps innovation, with the following benefits:

  • Easy to use

Customers can use functions such as metric anomaly detection, intelligent segmentation of log text, and high-latency diagnosis of trace links directly in the Log Service console, so the models are available wherever they work

Domain-specific foundation models are prepared in advance and can be used directly, with no tedious deployment: one click is enough to start

The domain-specific large language model launched this time significantly lowers the barrier to using the core capabilities of Log Service: it helps generate query statements and serves as an intelligent assistant

  • Flexibility

Customers do not need to fine-tune the model for specific scenarios; the general foundation model provided by Log Service gives good results as is

The Log Service platform natively supports annotation and feedback for Log, Metric, and Trace data, so customers can annotate quickly during use and build up datasets suited to their scenarios

  • Scalability

Backed by Alibaba Cloud's computing power, the general foundation model provided by Log Service can scale out and migrate services quickly

In the future, customers will be able to fine-tune domain-specific models and deploy them quickly in parallel to create private model services

Original link: https://developer.aliyun.com/article/1396326?utm_content=g_1000386345

Please do not copy or reprint this article. This article is the original content of Alibaba Cloud and may not be reproduced without permission

Statement:
This article is reproduced from sohu.com. If there is any infringement, please contact admin@php.cn to have it removed.