In their new paper "Improving Vulnerability Remediation Through Better Exploit Prediction," researchers from Cyentia, Virginia Tech, and the RAND Corporation take a detailed look at one of the most common infosecurity problems. The paper was given at the 2019 Workshop on the Economics of Information Security in Boston.
The researchers argue that a key challenge firms face is identifying a vulnerability remediation strategy that best balances two competing forces.
One strategy would be to patch every vulnerability on the network. While this would provide the greatest coverage of patched vulnerabilities, it would also inefficiently consume resources by fixing low-risk flaws.

On the other hand, patching only a few high-risk vulnerabilities would be highly efficient, but could leave the firm exposed to many other high-risk vulnerabilities.
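The trade-off can be made concrete with two simple metrics: coverage (the share of exploited vulnerabilities a strategy patches) and efficiency (the share of patched vulnerabilities that were actually exploited). The sketch below is a toy illustration of that framing, not code from the paper; the vulnerability IDs are hypothetical.

```python
def coverage_and_efficiency(patched, exploited):
    """Coverage: fraction of exploited vulns that got patched (recall).
    Efficiency: fraction of patched vulns that were exploited (precision)."""
    patched, exploited = set(patched), set(exploited)
    hits = patched & exploited
    coverage = len(hits) / len(exploited) if exploited else 0.0
    efficiency = len(hits) / len(patched) if patched else 0.0
    return coverage, efficiency

# "Patch everything": perfect coverage, poor efficiency.
print(coverage_and_efficiency(range(10), [1, 2]))  # (1.0, 0.2)

# "Patch a chosen few": perfect efficiency, but exploited vuln 2 is missed.
print(coverage_and_efficiency([1], [1, 2]))        # (0.5, 1.0)
```

In the paper's terms, the "patch everything" strategy maximizes the first number at the cost of the second, and vice versa.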
To compare how the various strategies would perform in practice, the researchers linked a large number of datasets and applied machine learning.
However, they made a significant change in their modeling compared to other studies. Rather than relying on predictions of exploitability, they used "exploits in the wild" as their outcome variable: an exploit had to be real and observed in actual use to count.
They note: "Notably, we observe exploits in the wild for 5.5% of vulnerabilities in our dataset compared to 1.4% in prior works."
Their dataset was quite extensive. It may be one of the largest used in such a study.
The study looked at security flaws, scores and vulnerability characteristics that were found in NIST's National Vulnerability Database (NVD).
They also drew "in the wild" exploit data from FortiGuard Labs, as well as from the SANS Internet Storm Center, Secureworks CTU, AlienVault's OSSIM metadata and ReversingLabs metadata.
Kenna Security provided a count of the prevalence of each vulnerability, derived from Kenna's scans of hundreds of corporate (customer) networks.
The paper found that, of the roughly 76,000 vulnerabilities discovered between 2009 and 2018, 4,183 had been exploited in the wild.
Dispelling one of the most common myths in information security, they found no correlation between the publication of proof-of-concept (PoC) exploit code on public websites and the start of exploitation attempts in the wild.
Another finding was that vulnerabilities exploited in the wild usually have a high CVSSv2 severity score. The researchers suggest that prioritizing vulnerabilities that both carry a high CVSSv2 score and have been observed as active in the wild will get the most remediation done for the least effort.
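That rule of thumb is easy to express in code. The sketch below filters a vulnerability list on both conditions; the field names (`cvss_v2`, `exploited_in_wild`) and the CVE labels are hypothetical, not taken from the paper or any vendor API.

```python
def top_priority(vulns, cvss_threshold=7.0):
    """Rule-based triage: keep vulns with a high CVSSv2 score AND
    observed in-the-wild exploitation. Field names are illustrative."""
    return [v for v in vulns
            if v["cvss_v2"] >= cvss_threshold and v["exploited_in_wild"]]

findings = [
    {"cve": "CVE-A", "cvss_v2": 9.8, "exploited_in_wild": True},
    {"cve": "CVE-B", "cvss_v2": 9.8, "exploited_in_wild": False},
    {"cve": "CVE-C", "cvss_v2": 3.1, "exploited_in_wild": True},
]
print([v["cve"] for v in top_priority(findings)])  # ['CVE-A']
```

Only the flaw that satisfies both conditions survives the filter, which is what makes the rule efficient but also what limits its coverage.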
But the study tried to find the best strategy among the possible choices. They conclude: "Overall, like the other rules-based strategies, focusing on individual features (whether CVSS, published exploit, or reference tags) as a decision point yields inefficient remediation strategies."
Whatever model is used, they found that a machine-learning approach to remediation reduces costs while maintaining the existing level of coverage.
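To illustrate what model-based prioritization looks like in miniature (a toy stand-in, not the paper's actual model or features), the sketch below trains a tiny hand-rolled logistic regression on two made-up features, a normalized CVSS score and a published-PoC flag, then ranks candidate vulnerabilities by predicted exploitation probability.

```python
import math

def train_logistic(X, y, lr=0.1, epochs=500):
    """Minimal logistic regression via gradient descent -- an
    illustrative stand-in, not the researchers' implementation."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            g = 1 / (1 + math.exp(-z)) - yi   # log-loss gradient w.r.t. z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))

# Toy training data: [normalized CVSSv2, published PoC?] -> exploited?
X = [[0.9, 1], [0.8, 1], [0.7, 0], [0.4, 1], [0.3, 0], [0.2, 0]]
y = [1, 1, 0, 0, 0, 0]
w, b = train_logistic(X, y)

# Rank new vulnerabilities by predicted exploitation probability.
candidates = [[0.95, 1], [0.5, 0], [0.6, 1]]
ranked = sorted(candidates, key=lambda x: predict(w, b, x), reverse=True)
print(ranked[0])  # the high-CVSS, PoC-published vuln ranks first
```

Patching down such a ranked list until the remediation budget runs out is the basic shape of the cost-saving approach the study describes, though the paper's model uses far richer features and real exploit data.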
— Larry Loeb has written for many of the last century's major "dead tree" computer magazines, having been, among other things, a consulting editor for BYTE magazine and senior editor for the launch of WebWeek.