Splunk Enhances Machine Learning With 300 New Algorithms


ORLANDO — As it wraps up its annual user conference here this week, San Francisco-based Splunk has one main message for the market. 

It’s going big (or, given that this is football season, long) with machine learning. In other words, it has significantly expanded the early version of machine learning in its platform to deliver new services and capabilities.

It has added machine learning to the core of its platform with a machine learning toolkit that can be installed as a free app on top of the Splunk Enterprise platform, Manish Sainani, principal product manager of Machine Learning at Splunk, told CMSWire.

This toolkit provides 300 algorithms for machine learning, 27 of which are pre-packaged out of the box, spanning all of the major categories, he said.

To be precise, the 27 pre-packaged algorithms focus on such categories as clustering, recommendations, regression, classification and text analytics. The 300 algorithms, which have been mined from the five major open source libraries, can be custom applied at the system’s extension points. They too fall under those general categories, Sainani said.

Splunk’s Unified View Gets Smarter

Splunk has also enhanced the machine learning in its IT Service Intelligence (ITSI) platform, which it introduced this time last year.

This offering was a major leap forward for Splunk. The application used analytics and machine learning to flag anomalies in the unified view of a company’s IT universe that Splunk so famously provides, find those anomalies’ root cause and then alert the user to their impact.

Here, though, would be a good place to stop and look back at Splunk’s core mission to truly understand how and why Splunk is using machine learning to remake its platform.

Collect, Index & Correlate Data

When Erik Swan and Rob Das co-founded Splunk in 2002, they had a vision of creating an IT framework with a Google-like interface to find problems in the infrastructure.

The canonical use case was a company whose IT was built on a traditional three-tier architecture. When there was a problem, the IT staff would engage in a time-consuming hunt-and-peck search for the issue. They would drill down into each machine in each tier to see the logs and then try to isolate the problem.

What Splunk did was collect, index and correlate all that data so it could be viewed as a time series. For example, someone could see what was happening across the system when the website went down at 5:33 am.

Splunk kept that data in one place so that multiple people within the organization could search it almost as easily as one can search Google.
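As a rough illustration of the collect, index and correlate idea, the Python sketch below merges log events from three tiers into one timeline and pulls out everything that happened near an outage. It is not Splunk code: the event records, host names, date, and the `build_index` and `events_around` helpers are all hypothetical, invented here for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical log events from three tiers (every value here is invented).
EVENTS = [
    {"time": "2016-01-02 05:31:10", "tier": "web", "host": "web01", "message": "GET /checkout 200"},
    {"time": "2016-01-02 05:32:55", "tier": "app", "host": "app02", "message": "timeout waiting for db"},
    {"time": "2016-01-02 05:32:48", "tier": "db", "host": "db01", "message": "connection pool exhausted"},
    {"time": "2016-01-02 06:10:00", "tier": "app", "host": "app01", "message": "nightly job finished"},
    {"time": "2016-01-02 05:33:02", "tier": "web", "host": "web01", "message": "GET /checkout 503"},
]


def build_index(events):
    """Parse each timestamp and return all events sorted into one timeline."""
    parsed = [
        dict(event, time=datetime.strptime(event["time"], "%Y-%m-%d %H:%M:%S"))
        for event in events
    ]
    return sorted(parsed, key=lambda event: event["time"])


def events_around(index, when, minutes=2):
    """Return the events, from any tier, within `minutes` of `when`."""
    window = timedelta(minutes=minutes)
    return [event for event in index if abs(event["time"] - when) <= window]


index = build_index(EVENTS)
outage = datetime(2016, 1, 2, 5, 33)  # the site went down at 5:33 am
nearby = events_around(index, outage)

for event in nearby:
    print(event["time"], event["tier"], event["host"], event["message"])
```

Because every tier's events sit on one time axis, a single lookup around 5:33 am returns the database, application and web entries side by side, with no need to log in to each machine in turn.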

Over time it added more features, such as alerts and, later, extensibility points and the ability to monitor business services.

Enhancing Splunk ITSI

Then came ITSI.  

“ITSI was a way to up-level the value of our platform, make it hyper specific and relevant to users,” Jon Rooney, senior director of Product Marketing, tells CMSWire one year later.

The machine learning embedded in ITSI, in the meantime, has been doing what machine learning is supposed to do, which is learn from and adapt to data.

Thanks to the data set that is Splunk’s customer base, as well as Splunk’s own engineering credentials, ITSI’s machine learning has gotten smarter. It can tell, for example, that what is normal system behavior at 2 pm on a Tuesday is not normal at 6 am on a Saturday. It is less likely to deliver false positives.

The changes are not machine learning for machine learning’s sake, Rooney says, “but rather are focused on delivering applied functions for our customers.”

ITSI’s New Capabilities

Specifically, ITSI can now:

  • Train data or, as IT might call it, apply dynamic thresholds. The system knows what is normal on what day and time automatically. “Splunk couldn’t do that before,” Rooney said, “not without massive customization.”
  • Detect multivariate anomalies. Splunk is able to tell what should happen, or not happen, when X occurs, followed by Y and then A.
  • Better control events to make data more manageable. This includes event suppression, de-coupling and de-grouping to reduce the noise of false positives.
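
To make the dynamic-threshold idea from the first item concrete, here is a minimal Python sketch. It is not ITSI’s actual method, which the article does not describe. It assumes a simple approach: learn a mean and standard deviation for each weekday-and-hour slot, then flag values that stray more than k standard deviations from that slot’s mean. The function names, the sample request counts and the choice of k are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def learn_baselines(history):
    """Group (timestamp, value) pairs by (weekday, hour) and summarize each slot."""
    buckets = defaultdict(list)
    for ts, value in history:
        buckets[(ts.weekday(), ts.hour)].append(value)
    return {slot: (mean(vals), pstdev(vals)) for slot, vals in buckets.items()}


def is_anomalous(baselines, ts, value, k=3.0):
    """Flag a value more than k standard deviations from its slot's mean."""
    slot = (ts.weekday(), ts.hour)
    if slot not in baselines:
        return False  # no history for this slot, so this sketch stays silent
    mu, sigma = baselines[slot]
    # If sigma is 0 (all past values identical), any deviation is flagged.
    return abs(value - mu) > k * sigma


# Hypothetical requests-per-minute history: busy Tuesdays at 2 pm,
# quiet Saturdays at 6 am (all numbers invented for illustration).
history = [
    (datetime(2016, 9, 6, 14), 900),
    (datetime(2016, 9, 13, 14), 1000),
    (datetime(2016, 9, 20, 14), 1100),
    (datetime(2016, 9, 3, 6), 40),
    (datetime(2016, 9, 10, 6), 50),
    (datetime(2016, 9, 17, 6), 60),
]
baselines = learn_baselines(history)

tuesday_2pm = datetime(2016, 9, 27, 14)
saturday_6am = datetime(2016, 9, 24, 6)
print(is_anomalous(baselines, tuesday_2pm, 950))
print(is_anomalous(baselines, saturday_6am, 950))
```

The same reading of 950 requests per minute passes on a Tuesday afternoon and is flagged on a Saturday morning. A single static threshold would have to either miss the Saturday spike or raise false alarms every Tuesday.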

Splunk Advanced Ecosystem

Taken as a whole, the machine learning toolkit and ITSI’s enhancements mean some very advanced use cases for customers, Sainani said.

Companies can predict outages before they happen, for example, or identify so-called bad actors in a security situation before any malfeasance happens.

In short, they can better identify repeatable patterns in data that could lead to unexpected events, Sainani said.

“That doesn’t necessarily mean the unexpected events are bad — they could be good too,” he said.

