Implementing AIOps in 5 simple steps

IT Operations teams have faced mounting challenges as the tech stack has grown in both size and complexity. By Michael Procopio, Product Marketing Manager, Micro Focus

IT Operations teams now oversee data distribution and accessibility, applications, varying interdependent infrastructures, customer touchpoints and more, and keeping these plates spinning has become a monumental task. Ticket management alone can place overwhelming pressure on teams, because sifting through the sheer noise is incredibly resource-intensive. Fortunately for IT Operations teams, IT management solutions have matured in line with these challenges.

Around 2015, the idea of IT Operations Analytics (ITOA) crystallised. The application of big data analytics to raw IT operations data unlocked the ability to analyse huge amounts of information. This would, in principle, allow for more informed and focused decision-making. At the time, however, the tools for doing so were cost-prohibitive and highly specialised.

Fast forward a couple of years: deep learning, machine learning and AI were at the top of the Gartner hype cycle, and AIOps entered the lexicon of cutting-edge CIOs. The idea of combining AI, machine learning and automation with big data analysis of this ITOA data framed AIOps as the ideal extension of the ITOps team. At the time, however, the “how” was somewhat missing.

Now, with commercialised intelligent systems and white-label solutions widely available, AIOps helps leading enterprises manage their IT estates and streamline workflows. With this market maturation, we’ve seen more technical concepts emerge, such as “multi-domain AIOps”, the distinction between domain-agnostic and domain-specific solutions, and the idea of an AIOps overlay on existing tools.

At its core, however, AIOps always comprises three tenets: the ability to Observe estate performance, Engage with IT personnel, and Act through automation and remediation capabilities. With these abilities, vast, complex, hybrid environments can be managed relatively simply.

And yet, despite its proliferation and market growth, adopting AIOps solutions still sounds like a job for an Oxford data science graduate. It isn’t. With the right provider and partner, the intelligence should be built in. All IT teams need is the operational know-how and management skillset so commonly found within the department.

So here’s how to get started:

Adopt with purpose

Aligning with business goals and addressing well-known pain points is key. AIOps has a habit of coming across as either marketing-speak or nebulous scientific tinkering rather than an essential pillar of IT operations. Research from EMA found that mature AIOps supports, on average, 8 different domain-specific roles and 11 cross-domain roles. In other words, it can quickly seem unwieldy.

This can make pitching to purse holders extremely challenging at first. However, identifying readily addressable points of friction and framing them as small-scale pilots can make executive buy-in easier. With these in hand, teams can agree success criteria up front and give a quantifiable, projected figure or ROI.

Event noise reduction is a common starting point that can have a significant service impact. Large enterprises are inundated with IT events, so employing AIOps to help filter, manage and eventually remediate tickets can provide a relatively simple, yet highly impactful win.
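To make the idea of event noise reduction concrete, here is a minimal sketch in Python. It is not how any particular AIOps product works: the event fields (host, check, ts), the function name reduce_noise and the five-minute window are all assumptions chosen for illustration. It simply collapses repeated events from the same source into a single ticket when they arrive close together.

```python
def reduce_noise(events, window_seconds=300):
    """Collapse repeats of the same (host, check) pair into one ticket.

    An event joins the open ticket for its pair if it arrives within
    `window_seconds` of the last event absorbed by that ticket.
    Field names and the window length are illustrative assumptions.
    """
    tickets = []
    open_ticket = {}  # (host, check) -> ticket currently absorbing repeats
    for event in sorted(events, key=lambda e: e["ts"]):
        key = (event["host"], event["check"])
        ticket = open_ticket.get(key)
        if ticket is not None and event["ts"] - ticket["last_ts"] <= window_seconds:
            # Same problem still firing: count it, do not raise a new ticket.
            ticket["count"] += 1
            ticket["last_ts"] = event["ts"]
        else:
            ticket = {
                "host": event["host"],
                "check": event["check"],
                "first_ts": event["ts"],
                "last_ts": event["ts"],
                "count": 1,
            }
            tickets.append(ticket)
            open_ticket[key] = ticket
    return tickets


# Hypothetical sample data: five raw events, timestamps in seconds.
sample_events = [
    {"host": "app-01", "check": "cpu", "ts": 0},
    {"host": "app-01", "check": "cpu", "ts": 60},
    {"host": "db-01", "check": "disk", "ts": 30},
    {"host": "app-01", "check": "cpu", "ts": 120},
    {"host": "app-01", "check": "cpu", "ts": 1000},
]
sample_tickets = reduce_noise(sample_events)
print(len(sample_events), "events ->", len(sample_tickets), "tickets")
```

A pilot built around a rule like this gives a directly measurable success criterion: the ratio of raw events to tickets before and after.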

Enable observability

The ideal state of AIOps is for it to be interwoven with the entire IT estate. Every system, subsystem and even end node should eventually be connected and brought into the automated analysis and remediation fold.

There are three reasons for this. Firstly, the more data, the more the system can learn. It needs to be able to read and understand patterns, which allows it to become more proactive and independent of the ITOps team, freeing them to prioritise complex issues that need their attention. Secondly, faults or issues may otherwise lie beyond the system’s reach. IT errors typically cause a chain of events, so a system that sees only half of the picture adds limited value. Lastly, and related to the previous point, issues caused by foundational systems may only come to light further up the tech stack, and finding the root cause requires system-wide reach.
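The point about chains of events can be illustrated with a small Python sketch. It assumes a hand-written dependency map and a set of components that are currently alerting; the component names and the function probable_root_causes are hypothetical, and the map is assumed to contain no circular dependencies. The heuristic is deliberately simple: an alerting component is a probable root cause if nothing it depends on, directly or indirectly, is also alerting.

```python
def probable_root_causes(depends_on, alerting):
    """Return alerting components with no alerting component upstream of them.

    depends_on maps a component to the components it relies on.
    This is an illustrative heuristic, not a full root cause analysis.
    """
    alerting = set(alerting)

    def upstream(node, seen):
        # Collect everything `node` depends on, directly or transitively.
        for dep in depends_on.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                upstream(dep, seen)
        return seen

    return {node for node in alerting if not (upstream(node, set()) & alerting)}


# Hypothetical estate: the storefront symptoms trace back to the database.
estate = {
    "checkout": ["payments-api"],
    "payments-api": ["db-cluster"],
    "db-cluster": ["san-storage"],
    "search": ["search-index"],
}
roots = probable_root_causes(estate, {"checkout", "payments-api", "db-cluster"})
print(roots)
```

If the database were outside the system’s reach, it would not appear in the alerting set at all, and the same heuristic would point at payments-api, which is exactly the half-picture problem described above.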

Domain agnostic AI

It is common these days for domain tools to come with built-in intelligent capabilities. Yes, these have their immediate uses, but they typically operate within a silo. They fall short of true AIOps.

As mentioned earlier, symptomatic events often have root causes that are many steps removed. In short, single-domain, or domain-specific, AI is significantly more constrained than its domain-agnostic counterparts.

With agnostic systems, you can achieve what is called a first-time fix. This means that the AI system has pointed ITOps straight to the root cause or, ideally, fully remediated the issue itself. One Micro Focus customer experienced a 118% uptick in first-time fixes after rolling out the appropriate domain-agnostic tool. As you can imagine, this translates to huge time and resource savings.

Centralising the data

A common method of enabling observability and expediting implementation involves creating a central data lake. AIOps requires diverse and wide-ranging data sets in order to function effectively. This includes historical data, which trains the system to know what is normal and what is not, and how to react and how not to. It also includes current live data, which lets the system react to, remediate and ultimately even predict issues.
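As a rough illustration of how historical data lets a system know what is normal, the Python sketch below builds a baseline from past readings of a single metric and then checks live readings against it. The metric values, the function names and the threshold of three standard deviations are assumptions for the example; real AIOps tooling typically uses far richer models.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarise historical readings as (mean, sample standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a live reading that sits more than `threshold` deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any change at all is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical response times in milliseconds pulled from the data lake.
history_ms = [120, 118, 125, 122, 119, 121, 123, 120]
baseline = build_baseline(history_ms)

for live_reading in (124, 180):
    print(live_reading, is_anomalous(live_reading, baseline))
```

The same pattern only works if the historical and live data sit somewhere the system can reach both, which is the practical argument for centralising them.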

While many organisations have their estate domains covered with some form of intelligent system, these systems are more often than not siloed. Data lakes resolve and prevent this walling off while providing additional advantages. Firstly, it is easy to develop single-pane dashboards that give ITOps teams the same level of observability that the system has (fig. 1). Secondly, it can simplify implementation and roll-out: new pilots and projects are simpler to deploy, and scalability is more achievable.

 

From observe to action

AIOps is often described only in passive terms, i.e. what it can see and tell you. The next big step in realising its full potential is to add that final Act characteristic. Embedding AIOps into workflows to automate and remediate is where the true value comes in. The impact can be measured in orders of magnitude. Another Micro Focus customer saved over $1M in their first year, rising to $4M per annum by year three, with 95% of remediation incidents completed without any human interaction.
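The step from Observe to Act can be sketched as a lookup from a diagnosed cause to an automated runbook, with anything unrecognised handed to a person. Everything in the Python example below is hypothetical: the cause names, the runbook functions (which only return a description rather than touching any system) and the incident format are assumptions made for illustration.

```python
def restart_service(incident):
    # Placeholder runbook: a real one would call an orchestration tool.
    return f"restarted {incident['target']}"


def extend_volume(incident):
    # Placeholder runbook: a real one would call a storage API.
    return f"extended volume on {incident['target']}"


RUNBOOKS = {
    "service_hung": restart_service,
    "disk_full": extend_volume,
}


def act(incident, runbooks=RUNBOOKS):
    """Remediate automatically when a runbook exists, otherwise escalate."""
    runbook = runbooks.get(incident["cause"])
    if runbook is None:
        return {"id": incident["id"], "status": "escalated", "detail": "no runbook"}
    return {"id": incident["id"], "status": "auto-remediated", "detail": runbook(incident)}


incidents = [
    {"id": 1, "cause": "disk_full", "target": "db-01"},
    {"id": 2, "cause": "service_hung", "target": "app-01"},
    {"id": 3, "cause": "unknown_latency", "target": "edge-07"},
]
results = [act(incident) for incident in incidents]
automated = sum(r["status"] == "auto-remediated" for r in results)
print(f"{automated} of {len(results)} incidents handled without human interaction")
```

Tracking the share of incidents that end as auto-remediated is the kind of figure quoted in the customer example above.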

The starting point for any AIOps journey will vary depending on business priorities, and the order in which these steps are addressed may change, but these are the boxes you’ll need to check.
