Moving from reactive to proactive monitoring can be a real challenge for IT organizations. It usually involves extensive tuning of thresholds and alerts so that they give an early indication that a resource might fail. It can also increase the alert noise in your environment, and for the proactive initiative to succeed, more IT staff might be required to handle all the new alerts and indications and prevent a possible service failure.
To implement proactive monitoring successfully while the number of devices, and the volume of events they generate, grows each year, another approach needs to be considered.
While many organizations strive to implement proactive monitoring, some unfortunately fail because of the extra workload placed on an IT department that is already under a lot of pressure. This is a common fear among business managers considering a proactive approach.
Getting better within the same process is, in a lot of ways, much easier because your objective is to get better at what you’re already doing.
Working proactively changes the way you work and drives your organization to act on problems before they might even happen. This means that the people involved are required to change the way they work, and in return they will be able to plan their work better and reduce potential downtime.
Activating more alerts and changing thresholds might increase your workload, and that can sometimes drive people away from being more proactive. But what is the alternative? If you stay in the reactive phase, your organization will still suffer from downtime, and your work will only reduce the time it takes to fix each issue.
So isn’t it more important to prevent the failure from ever happening?
Instead of enabling more alerts and reconfiguring a lot of your thresholds, another possibility is to use some smart measurements and filter out events and performance data that are of no relevance to you.
Finding a needle in a haystack
Machine data amounts to millions and millions of rows related to servers and application components. By looking at the past, you can determine what normal behavior is and then pick out the data that moves outside the baseline. Calculating the deviation between the current value and the baseline gives you a good understanding of what could be the root cause of the problem. Read more about how to detect abnormal behavior with baseline filtering.
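As a rough illustration of baseline filtering, the sketch below treats the mean and standard deviation of past samples as the baseline and keeps only the current samples that deviate from it by more than a chosen limit. The function names, the use of standard deviations as the unit of deviation, and the default limit of 3.0 are assumptions made for this example, not a description of how any particular product does it.

```python
from statistics import mean, pstdev


def baseline(history):
    """Return (mean, standard deviation) of past samples as the 'normal' level."""
    return mean(history), pstdev(history)


def deviation(value, base_mean, base_std):
    """Distance of a value from the baseline mean, in standard deviations."""
    if base_std == 0:
        # A perfectly flat history: any different value is infinitely unusual.
        return 0.0 if value == base_mean else float("inf")
    return (value - base_mean) / base_std


def outside_baseline(current, history, max_deviation=3.0):
    """Keep only the current samples that fall outside the baseline band.

    Returns a list of (index, value, deviation) tuples.
    max_deviation is a hypothetical default chosen for this sketch.
    """
    base_mean, base_std = baseline(history)
    flagged = []
    for index, value in enumerate(current):
        dev = deviation(value, base_mean, base_std)
        if abs(dev) > max_deviation:
            flagged.append((index, value, dev))
    return flagged


if __name__ == "__main__":
    past_cpu = [40, 42, 38, 41, 39, 40]   # illustrative history
    todays_cpu = [41, 39, 75, 40]         # illustrative current samples
    for index, value, dev in outside_baseline(todays_cpu, past_cpu):
        print(f"sample {index}: value={value}, deviation={dev:.1f}")
```

Everything inside the band is dropped, so only the samples worth a closer look remain.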
Trying to find data that peaks at a certain time of day, and what effect that might have on other components, is like trying to find a needle in a haystack. Such peaks can be found by looking at the deviation between the average value and the maximum value of a counter. The larger the gap, the bigger the peak.
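The average-versus-maximum comparison can be sketched in a few lines. The example below ranks counters by the gap between their maximum and their mean. The counter names and values are made up for illustration, and the raw, unnormalized gap is an assumption. In practice you might divide by the mean to compare counters measured in different units.

```python
from statistics import mean


def peak_gap(samples):
    """Gap between the maximum and the average value of a counter."""
    return max(samples) - mean(samples)


def rank_by_peak(counters):
    """Return counter names ordered from the biggest peak to the smallest."""
    return sorted(counters, key=lambda name: peak_gap(counters[name]), reverse=True)


if __name__ == "__main__":
    # Hypothetical counters for one server, sampled over the same period.
    counters = {
        "cpu": [20, 22, 21, 95, 20],
        "memory": [60, 61, 62, 60, 61],
        "disk_queue": [10, 10, 40, 10, 10],
    }
    for name in rank_by_peak(counters):
        print(name, round(peak_gap(counters[name]), 1))
```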
By putting performance and event data into time buckets and relating them to a service, you can also look for other counters with the same patterns inside that service and correlate issues that might have to do with performance or alerts. When forecasts are pre-calculated, you can also predict upcoming issues by looping through the future values to detect these patterns.
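A minimal sketch of the bucketing and correlation idea follows. It averages timestamped samples into fixed-width buckets and then computes a Pearson correlation between two counters over the buckets they share. The five-minute bucket width, the function names, and the choice of Pearson correlation are all assumptions for this example. The text above does not say which correlation measure is used.

```python
from collections import defaultdict
from math import sqrt


def bucketize(samples, width_seconds=300):
    """Average (timestamp, value) samples into fixed-width time buckets."""
    sums = defaultdict(lambda: [0.0, 0])  # bucket key -> [running total, count]
    for timestamp, value in samples:
        key = int(timestamp // width_seconds)
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlate(a_samples, b_samples, width_seconds=300):
    """Correlate two counters over the time buckets they have in common."""
    a = bucketize(a_samples, width_seconds)
    b = bucketize(b_samples, width_seconds)
    shared = sorted(a.keys() & b.keys())
    if len(shared) < 2:
        return 0.0  # not enough overlapping buckets to compare
    return pearson([a[k] for k in shared], [b[k] for k in shared])


if __name__ == "__main__":
    # Hypothetical counters belonging to the same service.
    cpu = [(0, 10), (100, 20), (300, 50), (600, 90)]
    queue_length = [(10, 1), (310, 5), (620, 9)]
    print(round(correlate(cpu, queue_length), 3))
```

A coefficient close to 1 suggests the two counters rise and fall together inside the service. The same comparison could be run over forecast values instead of history, under the same assumptions.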
IT Service Analytics
When all of this is put together with your IT service in focus, the information and responsibility can be delegated to different teams or service owners. The calculations described above can also serve as key performance indicators and give valuable insights.
"Differing from traditional proactive initiatives, an automated analysis can significantly improve how your IT operations delivers quality, without the extra overhead needed."
Turning data into knowledge
IT Service Analytics is a plug-and-play business intelligence platform for Microsoft System Center. It enables your IT organization to make qualified decisions based on intelligent and accurate information gathered throughout your IT landscape.
Using advanced analytics and statistical capabilities on data already collected, IT Service Analytics turns your data into knowledge. Combined with data from your IT service management processes, it gives you a comprehensive toolset to keep you one step ahead of business demands while delivering high-quality IT services.