Snyk at RSAC 2021 — ML in SAST: Distraction or Disruption

Frank Fischer

June 30, 2021

Machine learning is a loaded term. While machine learning offers amazing potential for advancing technologies, it often gets used as a marketing buzzword describing glorified pattern recognition. So it becomes increasingly difficult to know if the application of machine learning to existing technology is going to break new ground or sell more licenses. That’s the problem that Frank Fischer, Product Marketing for Snyk Code, explores in his RSAC 2021 talk ML in SAST: Disruption or Distraction.

Static Application Security Testing (SAST) is a technology that finds security vulnerabilities in non-running source code. The goal of SAST is to find vulnerabilities as quickly and as early in the software development lifecycle (SDLC) as possible. As it uses source code as input for the analysis, it makes it the perfect place to apply symbolic AI, allowing SAST tools to run rules against code bases to see where problems exist.

While SAST has been around for a long time, it hasn’t had much in the way of disruption, only incremental improvement. We’ve seen steady improvements in accuracy, but improvements have come at the cost of computational intensity. This means scans that are both sound (no false positives) and complete (no missed vulnerabilities) can take hours or days to finalize. On top of that, SAST tools are often very good at catching problems, but very bad at providing actionable solutions. So the question is, can machine learning create the disruption that SAST needs to be fast, accurate, and actionable?

Before being able to answer that, though, it’s important to look at how SAST tools work. In order for them to find vulnerabilities, SAST tools need a knowledge base full of rules and methods for detecting malicious patterns. This knowledge base needs to be maintained by experts who research vulnerabilities and add new rules as symbolic AI. Knowledge base management is not something that can be automated yet, but it is something that can be improved with machine learning.

By using giant numbers of source code repositories (all available through open source) and applying machine learning, the experts that maintain SAST knowledge bases can exponentially increase the amount of vulnerabilities they can detect. By learning on billions of lines of code, soundness and completeness can take massive leaps. This sort of leap is a disruption.

But catching more vulnerabilities seems like an improvement, not a disruption. The true disruption would be in the application of machine learning to automate code fixes. The implementation of machine learning can’t just stop at detection, it also needs to be applied to the fixes that are applied to those vulnerabilities. When properly applied to SAST, machine learning has the ability to not just detect wrong patterns, but also know the correct patterns to apply for resolution. This is the disruption.

The final disruption that machine learning can offer is the ability to improve accuracy and actionability at scale. Every day, we have more computational power (ex: GPUs), better algorithms, and more pre-trained models (ex: GPT-3) to help machine learning perform at the scale needed to bring scan time down to minutes.

Based on all that, Frank comes to the conclusion that machine learning will in fact be a disruption in the field of SAST, providing improved speed, accuracy, and actionability. But even with the ability to disrupt, it can still be used as a distraction when implemented incompletely. When researching your next SAST, there are a few things you’ll need to consider:

Is some level of machine learning being applied in your SAST already? If not, where is it in the roadmap? Adding machine learning needs to be used to find and fix, and if it’s not on the roadmap for both, the SAST will fall short.
Does your SAST offer actionable solutions already? And does it do it directly from the tools developers use? If not, applying machine learning will only catch more vulnerabilities without offering solutions — meaning more work for developers and lower rates of adoption.
Does your SAST integrate directly into an automated pipeline? If not, there will still be a bottleneck, no matter how robust the machine learning implementation is. As an added note on this, in Snyk’s recent State of Cloud Native Application Security report, we found that 72% of fully automated teams fixed critical vulnerabilities in under a week. This number can be even higher with machine learning.

Watch the entire RSAC 2021 talk to learn more about machine learning in SAST. If you’re looking for a SAST tool that already offers speed, accuracy, and actionability, sign up and start using Snyk Code free.

Sign up for Snyk

Check your code, dependencies, containers, and IaC for security vulnerabilities — for free.

Join us at SnykCon 2021

We’re bringing development and security together in our free, 3-day virtual event focused on helping teams build securely.

Snyk at RSAC 2021 — ML in SAST: Distraction or Disruption

Snyk at RSAC 2021 — ML in SAST: Distraction or Disruption

Sign up for Snyk

Join us at SnykCon 2021

Recommend

让 API 好用的 9 个小技巧

mac小技巧之快速修改图片尺寸

One UI Watch

Graph-Native Scale: the Trillion+ Relationship Graph

Kubernetes && microk8s

最佳的管理者 – 库克

apple资讯：苹果即将推出国际系列 Apple Watch 表带和表盘

macOS 如何限制 CPU 进程占用？

Welcoming the Slovak Republic Government to Have I Been Pwned

揭晓 Windows 11 如何做到原生支持安卓应用

About Joyk