GenAI is shaping the future of incident management processes

WBOY
2023-11-17 19:47:17

Although the majority of respondents (59.4%) have clear incident management processes in place and say the level of automation meets their needs (71.1%), companies are still struggling with a surge in incidents and with resolving them quickly.

66.5% of companies reported an increase in the frequency of incidents affecting their customers in the past 12 months, an increase of 3.6% from the 2022 survey.

According to 63% of respondents, these downtime-causing events (e.g., application outages, degraded service quality) put companies at risk of losing up to $499,999 per hour on average, an increase of nearly 5% from 2022. 46.6% of respondents also said that downtime costs range from US$100,000 to US$2 million.

Companies find current incident management ineffective

The research points to GenAI as a means of solving existing problems in incident management: 84.5% of respondents either believe AI can significantly streamline their incident management processes and improve overall efficiency, or are excited about the automation opportunities AI offers for certain aspects of incident management.

“The insights we uncovered in our research highlight the urgent need for adaptive, LLM-based automation that goes beyond mere task repetition and instead dynamically incorporates cues and context in real time to adapt to the changing environment,” said Divanny Lamas, CEO of Transposit.

The needs of modern operations teams have surpassed traditional, rules-based automation tools. While many companies have well-established incident management processes in place, as the number of incidents continues to increase and the impact on customers and finances becomes increasingly significant, a transformative approach is required. Therefore, adopting innovative solutions like GenAI is the way forward. This approach, augmented by automation and guided by human judgment, not only speeds up incident processing, but also proactively identifies potential issues before they escalate and takes preemptive measures.

Reliability engineering teams face huge challenges in incident management. 73.9% of reliability engineering leaders encountered obstacles when handling incidents, including brittle automation scripts (59.7%), cumbersome manual processes (47.8%), and difficulty accessing expertise (47.2%).

Additionally, 42.5% of companies said their current incident management processes are ineffective or usable by only some team members because of confusing documentation (41.3%), limited tool availability (40.4%), and reliance on institutional knowledge (39.7%).

Over the past year, 61.5% of companies said the time required to handle incidents has increased, and 79.8% said it takes up to six hours on average from first alert to resolution. Beyond longer resolution times, assembling the right team members adds another layer of complexity: 71.3% of respondents said that process alone could take up to 30 minutes.

A significant number of team members also found it challenging to understand and routinely apply company-defined procedures. 37.4% of companies reported that only selected team members fully understand the defined incident management process and adhere to it consistently.

Barriers to automation increase incident complexity

Companies need to address inefficiencies in incident handling and overcome barriers to implementing automation. 33.3% of respondents stated that only 11%-25% of their incident management tasks or workflows are automated, indicating that companies have an opportunity to automate more of the incident management process.

When asked in more depth, respondents showed strong interest in automating key aspects of the incident lifecycle: 50.0% were interested in incident setup, 44.2% in communication protocols, 30% in the investigation process, and 29% in remediation measures.

Despite this interest in automation, respondents cited four major barriers to achieving it:

  • Not enough support from leadership or management (57.1%).
  • Insufficient knowledge sharing (54.3%).
  • Insufficient documentation of institutional knowledge and existing procedures (54%).
  • Not sure what to automate (52.4%).

Companies are able to automate faster when using SaaS tools. According to the survey, 74.6% of respondents use SaaS tools, and 82.0% confirmed that they can automate operations without writing code. Additionally, 84.3% of respondents said doing so took only 11 minutes to an hour, demonstrating the effectiveness of SaaS solutions in incident management.

Companies employ AI-based applications and automation tools to enhance the technology stack

In the next 12 months, 72.1% of teams expect to expand their technology stack. To enhance incident management processes and reduce mean time to resolution/repair (MTTR), companies plan to implement new tools, including:

  • Tools or applications based on AI or ML (60.0%).
  • Automation tools or applications (53.1%).
  • Communication/collaboration tools or applications (48.1%).
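As a rough illustration of the MTTR metric mentioned above, the sketch below averages the time from first alert to resolution across a set of incidents. The timestamps and the `mean_time_to_resolution` helper are hypothetical and are not taken from the survey; real tooling may define the start and end of an incident differently.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (first alert, resolution) timestamps.
# These values are made up for illustration only.
incidents = [
    (datetime(2023, 11, 1, 9, 0), datetime(2023, 11, 1, 11, 30)),
    (datetime(2023, 11, 3, 14, 15), datetime(2023, 11, 3, 20, 15)),
    (datetime(2023, 11, 7, 22, 0), datetime(2023, 11, 8, 1, 0)),
]


def mean_time_to_resolution(records):
    """Return the average alert-to-resolution duration as a timedelta."""
    if not records:
        raise ValueError("at least one incident is required")
    # Sum the individual durations, starting from a zero timedelta.
    total = sum(
        (resolved - alerted for alerted, resolved in records),
        timedelta(),
    )
    return total / len(records)


mttr = mean_time_to_resolution(incidents)
print(f"MTTR: {mttr}")
```

Tracking this figure before and after introducing a new tool is one simple way to check whether the tool is actually shortening resolution times.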

SRE and platform engineering play a vital role in enabling AI and automation. Over the past year, 61.5% of respondents have increased their focus on SRE practices and plan to hire more site reliability engineers, while 57.5% have stepped up their platform engineering efforts and plan to bring in more platform engineers. These strategic moves underscore companies' commitment to strengthening their incident management capabilities.

The survey results point to a clear path forward for the incident response lifecycle, highlighting the need for a SaaS tool or platform that seamlessly integrates all the incident management tools a company uses, draws on insights from human data, and applies GenAI to improve operational efficiency and decision-making.

AI Reshapes Work Experience

90.4% of respondents believe that systematically mining insights from human data (such as archived Slack communications, retrospective interviews, and group feedback) can improve future incident response and operational quality. However, 90.2% believe automation should allow humans to use their judgment at key decision points to make it more reliable and effective, an increase of nearly 10% from the 2022 study.

89.8% found that integrating GenAI capabilities into an incident management tool or platform reduced the time needed to create new automations, freeing up time for other high-value work. 96.3% believe it would be beneficial if all the tools their company uses during an incident were integrated through one tool or platform.

For the 79.5% of companies adopting AI in their technology stack, the impact is significant. 51% of respondents believe that artificial intelligence is making their jobs better, pointing to improvements in working life. 63.5% use AI to improve the accuracy and quality of data, and 50.7% reported resolving incidents faster. 49.4% use AI to identify root causes of issues, potential threats, and vulnerabilities more quickly and easily, and 48% use it to automate repetitive tasks or processes, effectively streamlining their operations.

Lamas concluded: “Considering the ever-changing needs of modern operations teams, it’s clear that what these teams need is an LLM-based adaptive automation and incident management solution. This unified, intelligent approach not only streamlines processes; it also enables teams to leverage automation and artificial intelligence to enhance the company’s incident management processes and develop more efficient automated workflows. By ensuring humans remain actively involved, this approach is becoming increasingly important for seamless incident resolution and reduced MTTR. Ultimately, it allows teams to focus on what really matters: delivering efficient and effective solutions to complex problems.”

Statement:
This article is reproduced from 51cto.com.