Here's why we can't have nice things
DATA MISAPPROPRIATION CHEAPENS MITRE ATT&CK EVALUATION, BUT HERE’S WHAT IS IMPORTANT...
As a former Gartner analyst who led the EPP Magic Quadrant, I’m having a blast reading the vendor write-ups explaining their performance against the APT3-esque set of tests in the recent MITRE ATT&CK assessment, with vendors grabbing for one scoring algorithm that gives a single overall metric to prove who’s the best.
The unfortunate truth is that vendors are making a circus out of what is the most comprehensive assessment of EPP + EDR vendor capabilities to date.
I’m disappointed that so few vendors have taken the time to highlight that the ability to provide detection data is only part of the problem. It’s just as important to consider how that data is made available to your organization’s security operators, and how much faster your teams can make educated, confident decisions to contain threats and block adversaries.
The MITRE assessment has almost no way to take into account the blue team capabilities required to use the products effectively. So, the bottom line is, to quote myself from 14 months ago:
“Choose an EPP that improves the workflow, efficiency, and effectiveness of the tools and humans you have today.”
With that in mind, there are some elements in this assessment that are of more importance than others to most organizations evaluating any EPP or EDR vendor.
NONE – The product did not detect this activity at all, a complete miss.
NONE with Note – The product did not detect this activity as immediately malicious, but the data was captured and could be hunted.
SPECIFIC BEHAVIOR & SPECIFIC BEHAVIOR TAINTED – A specific detection that indicates exactly which malicious activity occurred. The modifier TAINTED means that this detection was made as a by-product of a detection far earlier in the attack chain, and the data is available somewhere to confirm this detection.
The first thing many IT Leaders ask is, “Which endpoint vendor blocks everything?” The answer to that is a quick and easy “None of them.”
After “Which vendor blocks everything?” comes the question, “Okay then, who missed the most?” MITRE provides a good amount of data here, which makes the question a little simpler to answer.
BUT – that doesn’t show the whole story. MITRE uses two versions of the detection type to describe missed malicious activity: “None” and “None with Note”.
While “None” is self-explanatory – this activity completely bypassed detection methods and no data was collected that could point to it – “None with Note” indicates that although a specific detection wasn’t highlighted by the vendor, the blue team was able to hunt and discover the activity. In other words, it shows which vendors collect the right types of event data to best enable the advanced part of EDR: threat hunting.
That’s the advanced part of the “ease of use” spectrum, but what really matters to the IT Leaders I’ve spoken to in the past week is contained in the “SPECIFIC” and “SPECIFIC TAINTED” detection types. These mean that the platform tells you what matters, and you do not need an expert to tell you this specific event matters. It is the closest thing to usability guidance in this type of evaluation.
For those organizations with the time and expertise to invest in a full analysis tailored to their specific needs, the ATT&CK evaluation data is a gold mine.
You can dive into the results to see which vendors had a whole bunch of “DELAYED” detections (i.e., delayed by hours or days) because they rely on a managed service to watch for suspicious activity. You can look at the highly FP-prone generic detections that might get lucky. And you can take the data, decide what detection methods matter most to you, and build a scorecard that's right for you.
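To make the scorecard idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the weights, the modifier penalties, the category labels (loosely named after the detection types discussed above), and the vendors and results, which are made up and are not MITRE data. The point is only that the weights are yours to set, based on the team and tools you actually have.

```python
from collections import defaultdict

# Hypothetical weights: how much each detection type is worth to YOUR team.
# A team with strong threat hunters might rate "none_with_note" higher.
WEIGHTS = {
    "none": 0.0,            # complete miss
    "none_with_note": 0.25, # not flagged, but the data could be hunted
    "generic": 0.5,         # generic, FP-prone detection
    "specific": 1.0,        # tells the operator exactly what happened
}

# Hypothetical penalties applied when a modifier is attached to a detection.
MODIFIER_FACTORS = {
    "tainted": 0.9,  # found as a by-product of an earlier detection
    "delayed": 0.5,  # arrived hours or days later
}


def score_step(detection, modifiers=()):
    """Score one test step: base weight scaled by each modifier's factor."""
    score = WEIGHTS[detection]
    for modifier in modifiers:
        score *= MODIFIER_FACTORS[modifier]
    return score


def build_scorecard(results):
    """Average the per-step scores for each vendor.

    `results` is an iterable of (vendor, detection, modifiers) tuples.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for vendor, detection, modifiers in results:
        totals[vendor] += score_step(detection, modifiers)
        counts[vendor] += 1
    return {vendor: totals[vendor] / counts[vendor] for vendor in totals}


# Made-up sample results, NOT real evaluation data.
SAMPLE_RESULTS = [
    ("Vendor A", "specific", ()),
    ("Vendor A", "none_with_note", ()),
    ("Vendor B", "specific", ("delayed",)),
    ("Vendor B", "generic", ()),
]

scorecard = build_scorecard(SAMPLE_RESULTS)
for vendor, score in sorted(scorecard.items(), key=lambda kv: -kv[1]):
    print(f"{vendor}: {score:.3f}")
```

Change the weights and the ranking changes with them, which is exactly why a single vendor-supplied number deserves suspicion and why your own scorecard is still only a data point.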
As always, my caution is that third-party tests are a great indicator of whether a vendor is fit-for-purpose and can certainly help organizations narrow down a shortlist of vendors. For the first time, our industry has a useful viewpoint into both prevention *and* detection/response capabilities. But test results are data points, not decision points. It feels like Groundhog Day having to repeat it over and over again, but there is no one-size-fits-all answer, and usability should rank just as high in a shortlist or PoC.