In business intelligence reporting, a common focus is understanding your operational performance. This means tracking the workload and results of operations. While this can be a sticky subject for operations, it’s also a great opportunity to improve. It’s a fact that when operations is overloaded, the quality of its response suffers. So it’s only common sense to track the NOC like you track the network. If overload is causing quality issues, operations needs to be aware so that remediation can occur. This could include staff augmentation or improved training regimes to drive better results. The trouble becomes how. Many focus on ticketing solutions. ITIL compliance lets management set target levels for operational performance. But those levels are not measured in real time. How does it help to know you needed help last Wednesday?
Where Machine Learning Comes Into Play
Again, ML/AI technology helps. Fault managers, which most call “event managers”, can track user and automation interactions with faults. Machine learning can be applied to this audit trail to create the standard operational model. The result is a discovered model. Say a common fault usually takes 10 actions and 15 minutes to fix during business hours. When the NOC deviates from its previous score, good or bad, the AI can alert the group: either “GREAT JOB, improving here is the new bar” or “let’s RALLY, we are getting behind”.
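To make the idea concrete, here is a minimal sketch of how a discovered baseline could be compared against a new resolution. It is not how any particular product does it. The `Resolution` record, the use of a simple median in place of a trained model, and the 20% tolerance are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Resolution:
    """One resolved fault from the audit trail (hypothetical record shape)."""
    fault_type: str
    actions: int      # user and automation actions taken on the fault
    minutes: float    # time from fault open to fault cleared


def build_baseline(history):
    """Return {fault_type: (median_actions, median_minutes)} from past resolutions."""
    grouped = {}
    for r in history:
        grouped.setdefault(r.fault_type, []).append(r)
    return {
        fault_type: (
            median(r.actions for r in records),
            median(r.minutes for r in records),
        )
        for fault_type, records in grouped.items()
    }


def score(resolution, baseline, tolerance=0.2):
    """Classify a resolution as 'improved', 'behind', or 'normal'.

    'improved' needs both actions and minutes to beat the baseline by the
    tolerance; 'behind' triggers if either one exceeds it by the tolerance.
    The tolerance value is an assumption, not a recommended setting.
    """
    base_actions, base_minutes = baseline[resolution.fault_type]
    ratios = (resolution.actions / base_actions, resolution.minutes / base_minutes)
    if max(ratios) <= 1 - tolerance:
        return "improved"
    if max(ratios) >= 1 + tolerance:
        return "behind"
    return "normal"


history = [
    Resolution("link_down", 10, 15),
    Resolution("link_down", 9, 14),
    Resolution("link_down", 11, 16),
]
baseline = build_baseline(history)
print(score(Resolution("link_down", 6, 9), baseline))    # well under the bar
print(score(Resolution("link_down", 14, 30), baseline))  # falling behind
```

A real system would likely segment the baseline by time of day (the example above says “during business hours”) and would update it as the NOC improves, so that a sustained “improved” result becomes the new bar.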
Proactive Workload Management
Let’s get into the details. Let’s say machine learning exposes that during a certain time of day and day of week, the norm is 4 level 1 tickets, 5 level 2 tickets, and 15 level 3 tickets. Then the system shows a systemic increase: 2x, then 5x, and then 10x. AI agents can see this risk and alert. That alert can show that an abnormal number of tickets has been opened. Operations managers can call in resources. The system can send an advisory email to the ticketing administration asking for a health check. Without ML/AI technology, running and interpreting reports takes so much time that most organizations will not even try. For those that do, the latency between needing a change and recognizing that need could be weeks.
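The ticket-volume check described above can be sketched roughly as follows. The baseline counts (4, 5, and 15) and the 2x, 5x, and 10x multiples come from the example. The severity names, the dictionary layout, and the idea of mapping each multiple to a severity are assumptions for illustration only.

```python
# Learned norm for one time-of-day/day-of-week slot (counts from the example above).
BASELINE = {"level1": 4, "level2": 5, "level3": 15}

# Multiples of the baseline and the severity assigned to each (hypothetical names),
# ordered from most to least severe so the first match wins.
THRESHOLDS = [(10.0, "critical"), (5.0, "warning"), (2.0, "advisory")]


def classify(observed, baseline=BASELINE):
    """Return {level: (severity, multiple)} for levels running above the norm."""
    alerts = {}
    for level, expected in baseline.items():
        multiple = observed.get(level, 0) / expected
        for threshold, severity in THRESHOLDS:
            if multiple >= threshold:
                alerts[level] = (severity, multiple)
                break
    return alerts


# Ticket counts observed in the current slot (made-up numbers for the demo).
current = {"level1": 8, "level2": 25, "level3": 150}
for level, (severity, multiple) in classify(current).items():
    print(f"{level}: {severity}, {multiple:.0f}x the normal ticket volume")
```

In practice the alert would feed whatever notification path the operations team already uses, such as paging the operations manager or emailing the ticketing administration for a health check.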
Positive Impact to Operations
The result of operational performance monitoring should be a smoother-working operations team. Fewer errors and happy customers are what every NOC should try to provide to the organization. Accomplishing this with zero human touch and a latency of less than 15 minutes has been unimaginable up to this point. The difference has been the emergence of ML/AI technologies.
Let me know what you think in the comments below. This can be a cringeworthy conversation to have with operations, but I do believe that near real-time operations performance management has value to NOCs today.
About the Author
Serial entrepreneur and operations subject matter expert who likes to help customers and partners achieve solutions that solve critical problems. Experience in traditional telecom, ITIL enterprise, global managed service providers, and datacenter hosting providers. Expertise in optical DWDM, MPLS networks, MEF Ethernet, COTS applications, custom applications, SDDC virtualized, and SDN/NFV virtualized infrastructure. Based in the Dallas, Texas, US area and currently working for one of his founded companies, Monolith Software.