4 Machine Learning Challenges for Threat Detection - InformationWeek

07:00 AM
Christopher Perry, Lead Product Manager, BMC Software

4 Machine Learning Challenges for Threat Detection

While ML can dramatically enhance an organization's security posture, it is critical to understand some of its challenges when designing security strategies.

Image: NicoElNino - stock.adobe.com

The growth of machine learning and its ability to provide deep insights using big data continues to be a hot topic. Many C-level executives are developing deliberate ML initiatives to see how their companies can benefit, and cybersecurity is no exception. Most information security vendors have adopted some form of ML; however, it’s clear that it isn’t the silver bullet some have made it out to be.

While ML solutions for cybersecurity can and will provide a significant return on investment, they do face some challenges today. Organizations should be aware of a few potential setbacks and set realistic goals to realize ML’s full potential.

False positives and alert fatigue

The greatest criticism of ML-detection software is the “impossible” number of alerts it generates -- think millions of alerts per day, effectively delivering a denial-of-service attack against analysts. This is particularly true of “static analysis” approaches that rely heavily on how threats look.

Even an ML-based detection solution that is 97% accurate may not help because, simply put, the math is not favorable.

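Let’s say an organization has one threat among 10,000 users on its network. Multiplying 0.97 (for 97% accuracy) by the chance of an actual threat among all users, or 1/10,000, gives only the probability that a given user is both a threat and flagged: 0.0097%. Bayes’ law also requires dividing by the overall alert rate, which is dominated by false alarms. If 97% accuracy is read as a 97% detection rate and a 3% false-positive rate, roughly 300 of the 9,999 benign users are flagged alongside the one real threat. This means that even with 97% accuracy, the actual likelihood of an alert being a real attack is only about 0.3%!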
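
As a rough check of the arithmetic, the sketch below applies Bayes' law in full. It assumes (the article does not say) that "97% accurate" means a 97% true-positive rate and a 3% false-positive rate; the function name is illustrative only. Under that reading, the chance that an alert is a real attack comes out to roughly 0.3%, while the bare product 0.97 × 1/10,000 is the probability that a randomly chosen user is both a threat and flagged.

```python
def posterior_threat_given_alert(prior, true_positive_rate, false_positive_rate):
    """Bayes' law: probability that an alert corresponds to a real threat."""
    true_alerts = true_positive_rate * prior            # P(alert and threat)
    false_alerts = false_positive_rate * (1.0 - prior)  # P(alert and benign)
    return true_alerts / (true_alerts + false_alerts)


prior = 1 / 10_000          # one threat among 10,000 users
joint = 0.97 * prior        # P(alert and threat), not yet a posterior
posterior = posterior_threat_given_alert(prior, 0.97, 0.03)

print(f"P(alert and threat) = {joint:.4%}")
print(f"P(threat | alert)   = {posterior:.2%}")
```

Either way the conclusion holds: with so few real threats in the population, nearly every alert an analyst sees is a false positive.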

Since improving beyond 97% may not be feasible, the best way to address this is to limit the population under evaluation through whitelisting or prior filtering based on domain expertise. This could mean focusing on highly credentialed, privileged users or on a specific, vital part of the business.
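
To see why shrinking the population helps, here is a minimal sketch that recomputes the chance an alert is real as the monitored group gets smaller. It assumes a 97% true-positive rate, a 3% false-positive rate, and that the single threat remains inside the filtered group; the population sizes are made up for illustration.

```python
def alert_precision(threats, population, tpr=0.97, fpr=0.03):
    """Probability that an alert is a real threat for a given population size."""
    prior = threats / population
    return (tpr * prior) / (tpr * prior + fpr * (1.0 - prior))


# Hypothetical population sizes: everyone, one business unit, privileged users only.
for population in (10_000, 1_000, 200):
    print(f"{population:>6} users -> {alert_precision(1, population):.1%} of alerts are real")
```

The detector itself is unchanged in each case; only the base rate of real threats in the monitored group improves.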

Dynamic environments

ML algorithms work by learning the environment and establishing baseline norms before they monitor for anomalous events that can indicate a compromise. However, if the IT enterprise is constantly reinventing itself to meet business agility needs and the dynamic environment doesn’t have a steady baseline, the algorithm cannot effectively determine what is normal and will issue alerts on completely benign events.
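
The effect of a shifting baseline can be sketched with a deliberately simple detector. The example below is not any vendor's algorithm: it flags values more than three standard deviations from a learned mean, and the connection counts are invented for illustration.

```python
from statistics import mean, stdev


def anomalies(history, observations, threshold=3.0):
    """Return observations more than `threshold` standard deviations from the baseline mean."""
    mu, sigma = mean(history), stdev(history)
    return [x for x in observations if abs(x - mu) > threshold * sigma]


# Hypothetical daily outbound connection counts for one host.
baseline = [100, 104, 98, 101, 97, 103, 99, 102]
# After a benign redeployment the host legitimately talks to more services.
after_deploy = [150, 148, 153, 151]

flagged = anomalies(baseline, after_deploy)
print("flagged against the stale baseline:", flagged)

# Once the baseline is relearned from post-change data, similar values stop alerting.
print("flagged against the new baseline:", anomalies(after_deploy, [152]))
```

Every post-deployment value is flagged even though nothing malicious happened, which is the failure mode described above when the environment changes faster than the baseline is relearned.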

To help minimize this impact, security teams must work within DevOps environments to know what changes are being made and update their tooling accordingly. The DevSecOps (development, security, and operations) acronym is beginning to gain traction since each of these elements should be synchronized and work within a shared consciousness.


Context

ML’s power comes from its ability to conduct massive multi-variable correlation to develop its predictions. However, when a real alert makes its way to a security analyst’s queue, this powerful correlation takes on the appearance of a black box and leaves little more than a ticket that says, “Alert.” From there, an analyst must comb through logs and events to figure out why the alert was triggered.

The best way to minimize this challenge is to equip the security operations center with tools that can quickly filter log data on the triggering entity. This is an area where artificial intelligence can help automate and speed up data contextualization. Data visualization tools can help as well by providing a quick timeline of events coupled with an understanding of the specific environment. A security analyst can then rapidly determine why the ML software sent the alert and whether it is valid.
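
As one possible shape for such tooling, the sketch below pulls the events for the triggering entity within a time window around the alert and orders them into a timeline. The event fields, entity names, and 15-minute window are all hypothetical and do not come from any particular product.

```python
from datetime import datetime, timedelta


def context_for_alert(events, entity, alert_time, window=timedelta(minutes=15)):
    """Return events for `entity` within `window` of the alert, oldest first."""
    nearby = [
        e for e in events
        if e["entity"] == entity and abs(e["time"] - alert_time) <= window
    ]
    return sorted(nearby, key=lambda e: e["time"])


t0 = datetime(2021, 4, 1, 9, 0)
# Hypothetical log records, deliberately out of order.
events = [
    {"time": t0 + timedelta(minutes=12), "entity": "svc-backup", "action": "bulk file read"},
    {"time": t0, "entity": "svc-backup", "action": "login from new host"},
    {"time": t0 + timedelta(minutes=5), "entity": "jdoe", "action": "login"},
    {"time": t0 - timedelta(hours=3), "entity": "svc-backup", "action": "scheduled job"},
]

timeline = context_for_alert(events, "svc-backup", alert_time=t0 + timedelta(minutes=10))
for e in timeline:
    print(e["time"].isoformat(), e["action"])
```

Even this crude filter turns a bare "Alert" ticket into a short, ordered story an analyst can evaluate.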

Anti-ML attacks

The final challenge for ML is attackers who quickly adapt to bypass detection. When that happens, it can have catastrophic effects, as researchers recently demonstrated by altering a 35 MPH speed-limit sign on a road so that a Tesla accelerated to 85 MPH.

ML in security is no different. A perfect example is an ML network-detection algorithm that uses byte analysis to determine, very effectively, whether traffic is benign or shellcode. Hackers adapted quickly by using polymorphic blending attacks, padding their shellcode with additional bytes to alter the byte frequency and fully bypass detection algorithms. It is further proof that no single tool is bulletproof and that security teams need to constantly assess their security posture and stay educated on the latest attack trends.
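
A toy model shows why byte-frequency detection is fragile. The sketch below is a simplification, not a real detector or a real attack: it compares a payload's byte histogram with a benign profile using L1 distance, and the "payload" is just a run of placeholder high bytes. The 0.5 threshold and the sample traffic are assumptions chosen for illustration.

```python
from collections import Counter


def byte_distribution(data: bytes):
    """Relative frequency of each of the 256 possible byte values."""
    counts = Counter(data)
    return [counts.get(b, 0) / len(data) for b in range(256)]


def distance(p, q):
    """L1 distance between two distributions (0 = identical, 2 = disjoint)."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical benign traffic used to learn the "normal" byte profile.
benign_sample = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n" * 20
benign_profile = byte_distribution(benign_sample)

payload = bytes(range(128, 192))   # placeholder bytes standing in for a payload
padded = payload + benign_sample   # same payload, padded with benign-looking bytes

THRESHOLD = 0.5  # illustrative cut-off: larger distances are treated as suspicious
d_raw = distance(byte_distribution(payload), benign_profile)
d_padded = distance(byte_distribution(padded), benign_profile)
print(f"raw payload:    distance={d_raw:.2f} flagged={d_raw > THRESHOLD}")
print(f"padded payload: distance={d_padded:.2f} flagged={d_padded > THRESHOLD}")
```

The payload bytes are identical in both cases; only the surrounding padding changes, yet the statistical fingerprint moves from clearly anomalous to close to normal.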

ML can be extremely effective in enabling and advancing security teams. The ability to automate detection and correlate data can save a significant amount of time for security practitioners.

However, the key to an improved security posture is human-machine teaming where a symbiotic relationship exists between machine (an evolving library of indicators of compromise) and man (penetration testers and a cadre of mainframe white-hat hackers). ML brings the speed and agility needed to stay ahead of the curve, and humans bring qualities that it can’t (yet) replicate -- logic, emotional reasoning, and decision-making skills based on experiential knowledge.

Christopher Perry is the Lead Product Manager for BMC AMI for Security at BMC Software. Perry got his start in cybersecurity while studying computer science at the United States Military Academy. While assigned to Army Cyber Command, Perry helped define expeditionary cyberspace operations as a company commander and led over 70 soldiers conducting offensive operations. He is currently getting his master’s degree in Computer Science with a focus in Machine Learning at Georgia Institute of Technology.
