The technology of machine learning with AI should be our first focus. As with all new technology, it has new terms and new concepts. Several of these are heretical to the status quo. It's important to set a proper context so we can have serious discussions.

True to my word, here is the first blog in the series on machine learning with AI in service assurance. To explain some of the solutions in the marketplace, we first need to talk about the technology. The terms and concepts are new. The goals for operations are the same: increasing automation and increasing quality.

Defining Machine Learning

Let’s start with machine learning. First, check out the Wikipedia article. Below is the definition:

“Machine learning is a field of computer science that often uses statistical techniques to give computers the ability to “learn” (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.”

I summarize it like an ant farm. Worker ants build the farm by digging tunnels. Once they are done, the model is complete. In this analogy, the ants are the machine learning. As the days go on, the ants change the farm when necessary. Then, say the farm falls over (oops). The ants have to rebuild, digging new pathways and updating the model. The point is that the ant farm is always changing and learning from its environment. Like the ants, machine learning builds and maintains the patterns, or as I call them, models. These models are compared to the current situation in real time. Artificial intelligence can then see whether they align (chronics) or diverge (anomalies).

Defining AI

AI is defined by Wikipedia as “the study of intelligent agents: any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals.”

The complexity of AI is a spectrum. On one end, some AI is cognitive, performing virtual reasoning. On the other, AI is nothing more than rules processing. In the context of machine learning, AI is usually applying or comparing models, that is, what machine learning has learned from a separate set of data. In the context of service assurance, the comparison is where the value lies: comparing the past with the present, or projecting the future using the past. This gives analytics its insights.

Understanding Anomalies

When performing a comparison, the model and the current situation either align or diverge within some set degree. If the current situation is a repeat of the past, you have detected a chronic situation. When the current situation is new and unusual, it's called an anomaly. Datascience.com defines three types of anomalies:

Point anomalies: A single instance of data is anomalous if it’s too far off from the rest. Business use case: Detecting credit card fraud based on “amount spent.”

Contextual anomalies: The abnormality is context specific. This type of anomaly is common in time-series data. For example, spending $100 on food every day may be normal during the holiday season but odd otherwise.

Collective anomalies: A set of data instances collectively helps in detecting anomalies. Imagine someone is trying to copy data from a remote machine to a local host. If this is unusual, it would be flagged as an anomaly and a potential cyber attack.
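To make the first two types concrete, here is a minimal sketch. It assumes a simple z-score test (distance from the mean in standard deviations), which is only one of many possible detection methods and not necessarily what any product uses. The function names, the threshold of 3, and the spending figures are all made up for illustration.

```python
from statistics import mean, stdev

def is_point_anomaly(history, value, threshold=3.0):
    """Flag a value that sits more than `threshold` standard deviations from the mean of history."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # History never varied, so any different value is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold

def is_contextual_anomaly(history_by_context, context, value, threshold=3.0):
    """Run the same test, but only against history recorded in the same context."""
    return is_point_anomaly(history_by_context[context], value, threshold)

# Hypothetical daily food spending in dollars, split by context.
spend = {
    "holiday": [95, 105, 100, 110, 90, 100],
    "regular": [20, 25, 22, 18, 24, 21],
}

print(is_contextual_anomaly(spend, "holiday", 100))  # False: normal for the holidays
print(is_contextual_anomaly(spend, "regular", 100))  # True: odd the rest of the year
```

The same $100 is normal in one context and anomalous in the other, which is the whole idea behind a contextual anomaly.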

Consider chronics as “negative” anomalies. Many customers I have talked to see chronic detection as the most common AI tool to have. Both anomalies and chronics are important AI tools operations can use to better monitor their estate.
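The difference between a chronic and an anomaly can be sketched as a simple lookup against history. This is an illustration only: it assumes a fault is identified by a (device, fault type) pair, and the `chronic_min` cutoff, the function name, and the sample faults are hypothetical.

```python
from collections import Counter

def classify_faults(history, current, chronic_min=3):
    """Label each current fault by how often its signature appeared in the past."""
    seen = Counter(history)
    labels = {}
    for signature in current:
        count = seen[signature]
        if count >= chronic_min:
            labels[signature] = "chronic"   # a repeat of the past
        elif count == 0:
            labels[signature] = "anomaly"   # new and unusual
        else:
            labels[signature] = "known"     # seen before, but not often
    return labels

# Hypothetical fault history and current fault inventory.
history = [("router-1", "link-down")] * 4 + [("switch-7", "fan-fail")]
current = [
    ("router-1", "link-down"),
    ("switch-7", "fan-fail"),
    ("core-2", "bgp-flap"),
]

for signature, label in classify_faults(history, current).items():
    print(signature, label)
```

A real model would learn far richer patterns than raw counts, but the comparison step has the same shape: past on one side, present on the other.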

Focusing on Service Assurance

This blog focuses on technology as it applies to service assurance (my major focus). Service assurance, as defined by Wikipedia, is:

“The application of policies and processes by a Communications Service Provider to ensure that services offered over networks meet a pre-defined service quality level for an optimal subscriber experience.”

Put another way, it means assuring the quality of the service. There are two ways to increase the quality of your services. One, automate resolutions, reducing how much issues impact customers. Two, proactively address issues before they become problems, and problems before they become outages.

Increasing Automation with Machine Learning

Increasing automation and reducing downtime are always goals for operations. Machine learning with AI tools focuses on correlation: the ability to segment faults into new groupings such as “situations”. Addressing each situation is key, but situations must be actionable. Any correlation leverages a model built by machine learning, then compares that model against the current fault inventory to produce the segmentation. This is the focus of the industry today, with mixed results, as we will discuss later.
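As a rough sketch of that learn-then-compare flow, the example below stands in a very simple "model" for the real thing: a count of which fault types showed up together in past incidents. Current faults are then grouped into situations whenever a pair has co-occurred often enough. The function names, the `min_support` cutoff, and the fault names are assumptions for illustration, not any vendor's algorithm.

```python
from collections import Counter
from itertools import combinations

def learn_cooccurrence(past_incidents):
    """Build the model: count how often each pair of fault types shared a past incident."""
    pairs = Counter()
    for incident in past_incidents:
        for a, b in combinations(sorted(set(incident)), 2):
            pairs[(a, b)] += 1
    return pairs

def segment(current_faults, pairs, min_support=2):
    """Compare the model against current faults and group linked faults into situations."""
    situations = []
    for fault in current_faults:
        # Find existing situations holding a fault that often co-occurred with this one.
        linked = [
            s for s in situations
            if any(pairs[tuple(sorted((fault, other)))] >= min_support for other in s)
        ]
        merged = {fault}
        for s in linked:
            merged |= s
            situations.remove(s)
        situations.append(merged)
    return situations

# Hypothetical past incidents and current fault inventory.
past = [
    ["link-down", "bgp-flap", "ospf-adj-lost"],
    ["link-down", "bgp-flap"],
    ["fan-fail", "temp-high"],
    ["fan-fail", "temp-high"],
    ["link-down", "ospf-adj-lost"],
]
model = learn_cooccurrence(past)
current = ["link-down", "temp-high", "bgp-flap", "fan-fail", "disk-full"]

for situation in segment(current, model):
    print(sorted(situation))
```

Five faults collapse into three situations here. Whether each situation is actionable is a separate question that a grouping step alone cannot answer.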

Being Proactive with Machine Learning

Being proactive is another buzzword from the late 90s (that was a decade ago, right?). The reality is that it's hard: you need to make educated guesses. You can't make quality guesses if you don’t have the data. Data reduction, prevalent in the industry, has caused operations to discard 99% (or more) of their data. Without this data, your guesses will be poor. A machine learning model with AI can leverage low-level information that may predict a future outage. That prediction gives operations lead time to fix the issue before an outage occurs. This is the hype focus of the industry today.
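To show what lead time means in practice, here is a minimal sketch that assumes the simplest possible predictor: a straight-line trend fitted to a low-level metric. Real predictive models are much richer. The function name, the metric, and the threshold are hypothetical.

```python
def lead_time(samples, threshold):
    """Fit a line to (time, value) samples and estimate the time left before the threshold is crossed.

    Returns None when the trend is flat or falling, since no crossing is predicted.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    if var == 0:
        return None
    slope = cov / var
    if slope <= 0:
        return None
    intercept = mean_v - slope * mean_t
    crossing = (threshold - intercept) / slope
    last_t = samples[-1][0]
    # A crossing in the past means no lead time is left.
    return max(crossing - last_t, 0.0)

# Hypothetical corrected-error counts on a link, sampled hourly.
errors = [(0, 10), (1, 12), (2, 14), (3, 16)]
print(lead_time(errors, threshold=30))  # 7.0 hours until the assumed failure level
```

Note that this only works if the low-level samples were kept. If data reduction has already thrown them away, there is nothing to fit.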

It is important to provide context when discussing machine learning with AI. This helps us all understand the technology and enables the deeper discussion. The concepts and terms are new, but the industry hasn’t changed. Next up will be applying this technology to problems experienced by the industry.

About the Author

Serial entrepreneur and operations subject matter expert who likes to help customers and partners achieve solutions that solve critical problems. Experience in traditional telecom, ITIL enterprise, global managed service providers, and datacenter hosting providers. Expertise in optical DWDM, MPLS networks, MEF Ethernet, COTS applications, custom applications, SDDC virtualized, and SDN/NFV virtualized infrastructure. Based in the Dallas, Texas, US area and currently working for one of the companies he founded, Monolith Software.

I’m alive! Sorry for the lengthy lapse in updates, but things have been so busy. With the release of some of my recent work, I now have some time to share what I have been working on. As the title suggests, it's all about machine learning and artificial intelligence (AI). Despite these being hot buzzwords, there are tons of success stories using the technology today. To explain what I have learned, I have written a blog series. My hope is to cover my journey in machine learning with artificial intelligence.

Blog Focus: Machine Learning

The first focus is the technology. As with all new technology, it has new terms and new concepts. Several of these are heretical to the status quo, so it's important to level set properly. Then we want to discuss where it applies. What problems does it solve? How well does it solve them? How do the new solutions compare to legacy ones?

Solving Problems with Machine Learning

Now that we have a firm introduction, let's solve some problems. We start with the use case of fault storm management. How are storms detected and mitigated? What are the rewards of applying ML/AI?

My next favorite use case is around fault stream reduction. Fewer faults mean less effort for operations. Can ML/AI help? How well does it work? How hard is it to use?

Operational performance management is a touchy subject, but a worthwhile exercise. Why should you monitor your NOC? How can you help operations without being Orwellian about it?

Chronic detection and mitigation is a common use case for operations. How does operations iron out the wrinkles in their network? Can operations know when to jump on a chronic and fix it for good? Getting to 99.999% is hard without addressing chronic problems.

What is the Future of Machine Learning

As part of this series, we should address the future of this technology. With ML/AI being so popular, where should this technology be applied? How can it help service assurance make an impact?

The plan is to wrap up the series with a review of what is currently available in the marketplace, so we are all aware of what is current versus what is possible.

Stay tuned. The plan is to release the blogs weekly. Don’t be afraid to drop comments or questions, as I would love to do an AMA or a blog on the questions.

About the Author

Shawn Ennis

Founder, Father, and Serial Entrepreneur. Here is where I come to work, play, and share. Follow me on Twitter @Shawn_Ennis or Linkedin.

Monthly Archives: June 2018

The Technology of Machine Learning with AI