# About security

# Security terms

Here we explain common security terms that are used in our tool.

# CVE

Common Vulnerabilities and Exposures - A publicly disclosed vulnerability, published in an open database such as the NVD, with an assigned identifier known as a CVE ID. Examples include Heartbleed (CVE-2014-0160) and Shellshock (CVE-2014-6271).

# CVSS

Common Vulnerability Scoring System - An open framework for describing the severity of vulnerabilities, where each vulnerability is given a score between 0 and 10, 10 being critical.
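
For reference, the CVSS v3.x specification also groups scores into qualitative severity ratings. A minimal Python sketch of that mapping (the bands below are the standard CVSS v3.x ones, not anything specific to our tool):

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score (0.0-10.0) to its qualitative rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"

print(cvss_severity(7.5))  # "High", e.g. Heartbleed (CVE-2014-0160) under CVSS v3.1
```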

# CWE

Common Weakness Enumeration - This is a weakness, either in software or in hardware, that may be exploited in a specific system. The CWE list is a tree hierarchy with different levels of abstraction. An example of a CWE tree chain, from high to low abstraction, may look like this: "Improper Restriction of Operations within the Bounds of a Memory Buffer" (CWE-119) -> "Buffer Copy without Checking Size of Input" (CWE-120) -> "Stack-based Buffer Overflow" (CWE-121).

# CPE

Common Platform Enumeration - This is a naming scheme for IT systems, software, and packages. An example of a CPE string for the React framework, version 16, is cpe:2.3:a:facebook:react:16.0.0:*:*:*:*:*:*:*.
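
A CPE 2.3 string is a colon-separated list of eleven attributes (part, vendor, product, version and so on) after the "cpe:2.3" prefix. A simplified Python sketch that splits the React example above into named fields; a complete parser would also have to handle colons escaped with a backslash, which this sketch ignores:

```python
CPE_FIELDS = [
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
]

def parse_cpe23(cpe: str) -> dict:
    """Split a CPE 2.3 formatted string into its named attributes.

    Simplified: does not handle colons escaped with a backslash.
    """
    prefix, version, *attrs = cpe.split(":")
    if (prefix, version) != ("cpe", "2.3") or len(attrs) != len(CPE_FIELDS):
        raise ValueError(f"not a CPE 2.3 string: {cpe}")
    return dict(zip(CPE_FIELDS, attrs))

print(parse_cpe23("cpe:2.3:a:facebook:react:16.0.0:*:*:*:*:*:*:*"))
# {'part': 'a', 'vendor': 'facebook', 'product': 'react', 'version': '16.0.0', ...}
```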

# npm

Node Package Manager - A package manager for JavaScript, consisting of the command line client npm and an online database of packages known as the npm registry. npm handles local dependencies as well as global JavaScript tools. In 2020, npm joined forces with GitHub.

# NVD

National Vulnerability Database - An open vulnerability database managed by the U.S. government. The information displayed is aggregated from multiple sources and includes a severity score using CVSS, the type of weakness as a CWE, and the affected products as CPEs.
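
NVD also exposes this aggregated data through a public JSON API. A hedged sketch of looking up a single CVE with Python and the requests library (exact response fields can vary between records and API versions):

```python
import requests

NVD_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def fetch_cve(cve_id: str) -> dict:
    """Fetch one CVE record from the NVD 2.0 REST API."""
    response = requests.get(NVD_API, params={"cveId": cve_id}, timeout=30)
    response.raise_for_status()
    return response.json()["vulnerabilities"][0]["cve"]

record = fetch_cve("CVE-2014-0160")  # Heartbleed
print(record["id"])
# CVSS scores, CWE references and CPE configurations live under
# record["metrics"], record["weaknesses"] and record["configurations"].
```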

# Data sources

Ah, everything in this modern world begins with data! And so does Debricked.

Debricked's algorithms constantly scan various sources for information about vulnerabilities, licenses and health data. These include, but are not limited to, the NVD Database, NPM, C# Announcement, FriendsOfPHP's security advisories, Go Vulnerability Database, PyPA Python Advisory Database, GitHub Issues, GitHub Security Advisory, mailing lists, and more. We ping our sources every 15 minutes, giving fast and accurate data.

# Data refinement

When the data is collected we "clean it up", since it is often quite messy. As our sources are a combination of structured and unstructured data, there are a lot of errors in them by default.

# An example: CVE parsing

The largest source of vulnerability data is the NVD database. Just like many other solutions, we use it as one of our primary sources. The problem with this source is that the CPEs (the products connected to vulnerabilities) are often mislabelled, and it is common to see a time lag of up to four weeks in assigning CPEs to CVEs. Here we use our state-of-the-art natural language processing to re-classify the vulnerabilities, increasing the number of correctly classified vulnerabilities and reducing that time lag to 0 days. This is one of many data-refinement activities that we carry out 24/7 for our customers.
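
The details of our NLP models are beyond this overview, but as a toy illustration of the general idea, one could score how well a CVE description matches known product names using simple token overlap. The product catalogue and aliases below are made up for the example:

```python
import re

# Hypothetical catalogue of products: (vendor, product) -> name aliases.
KNOWN_PRODUCTS = {
    ("facebook", "react"): {"react", "reactjs"},
    ("openssl", "openssl"): {"openssl", "heartbeat"},
}

def guess_products(description: str) -> list:
    """Rank known products by naive token overlap with a CVE description.

    A toy stand-in for real NLP: tokenize the description and count how many
    aliases of each product appear in it.
    """
    tokens = set(re.findall(r"[a-z0-9.]+", description.lower()))
    scores = []
    for key, aliases in KNOWN_PRODUCTS.items():
        overlap = len(tokens & aliases)
        if overlap:
            scores.append((overlap, key))
    return [key for overlap, key in sorted(scores, reverse=True)]

desc = "The Heartbeat Extension in OpenSSL allows remote attackers to read memory."
print(guess_products(desc))  # [('openssl', 'openssl')]
```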

# Fully automated - no humans

One of the key differentiators of Debricked's tool is that we do not use any form of manual analysis of vulnerabilities. A risky bet which took almost 5 years of R&D to pull off! But as a result, as soon as a vulnerability is discovered in a data source, we index it, refine it and try to find a fix. All of this can happen within 15-30 minutes. Moreover, we constantly monitor for changes in the data regarding this particular vulnerability. In contrast, it takes an average of 30 days for the NVD database to complement their data with more details, and sometimes it is never done if the vulnerability has low priority. The same is true for finding fixes and other details. But with Debricked you can rest assured that our systems are working around the clock, without introducing any noticeable lag between the vulnerability sources and your developers. Debricked is here to assist you in building the bricks of security!

# Scanning the code for dependencies & matching

In the next step, Debricked will scan your projects for dependency files. This can be done in a variety of ways, as described further in this documentation, e.g. by CI/CD integrations (recommended), manual uploads and our API.

# What we look for

We essentially scan for any declared dependencies in files such as the well-known "package-lock.json", "composer.json" and so forth. Next, the dependency file is transformed into our own internal format and sent to our matching & rule engine. The tree of indirect dependencies is also built and traversed in this process.
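
As an example of what this scanning sees, the full dependency tree of an npm project can be read from the "packages" map in a modern package-lock.json (lockfile version 2 or 3). A simplified sketch, not our actual parser:

```python
import json

def read_lockfile(path: str) -> dict:
    """Collect {name: version} for every dependency, direct and indirect,
    from an npm package-lock.json with lockfileVersion 2 or 3."""
    with open(path) as fh:
        lock = json.load(fh)
    deps = {}
    for pkg_path, meta in lock.get("packages", {}).items():
        if not pkg_path:          # "" is the root project itself
            continue
        # Keys look like "node_modules/react" or "node_modules/a/node_modules/b".
        name = pkg_path.split("node_modules/")[-1]
        deps[name] = meta.get("version")
    return deps

print(read_lockfile("package-lock.json"))
```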

# Matching engine & rule engine

These are two pieces of software that A) match the vendor and name of your dependency against our internal database, and B) determine the likelihood of that match being correct. Unfortunately, open source projects often have similar names, share parts of names or even have identical names under different vendors. Because of this, simple regular expressions and white/blacklists don't cut it. Again, we make use of modern technology such as machine learning to determine the likelihood of the match being a true positive, based on our "secret sauce" algorithms which are currently being patented.
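
The exact scoring model is part of the patented "secret sauce" mentioned above, so the following is only a toy illustration of the problem, combining fuzzy name similarity with vendor agreement:

```python
from difflib import SequenceMatcher

def match_likelihood(found_vendor: str, found_name: str,
                     db_vendor: str, db_name: str) -> float:
    """Toy confidence score in [0, 1] that two (vendor, name) pairs refer to
    the same open source project. Illustrative only."""
    name_sim = SequenceMatcher(None, found_name.lower(), db_name.lower()).ratio()
    vendor_sim = SequenceMatcher(None, found_vendor.lower(), db_vendor.lower()).ratio()
    # Same-name, different-vendor collisions are common, so weight the vendor too.
    return 0.7 * name_sim + 0.3 * vendor_sim

print(match_likelihood("facebook", "react", "facebook", "react"))     # 1.0
print(match_likelihood("someoneelse", "react", "facebook", "react"))  # same name, different vendor
```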

However, the accuracy of these algorithms varies depending on which language and package manager you are using. Read our benchmarks for more information on how well we support your stack. If it is not that great - be sure to let us know so that we can prioritise and boost the precision!

# Suggesting a solution to your problems

In most cases, the solution to vulnerabilities in open source dependencies is simply to update the dependency to a later version that is not vulnerable. Often the update is easy to make, but if the gap between the versions is large enough, an update could cause breaking changes to your code. We help you figure out which version to update to by finding the smallest possible update that still fixes the vulnerability, helping you fix the problem while keeping the risk of breaking changes as low as possible!
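
A hedged sketch of that version selection, assuming the list of released versions and the first fixed version are already known (real semver handling with pre-release tags and build metadata is simplified away):

```python
def smallest_fixing_upgrade(all_versions: list, first_fixed: str) -> str:
    """Pick the lowest available version at or above the first fixed version."""
    def as_tuple(v: str):
        # Compare versions as dotted numeric tuples, e.g. "4.1.44" -> (4, 1, 44).
        return tuple(int(part) for part in v.split("."))

    candidates = [v for v in all_versions if as_tuple(v) >= as_tuple(first_fixed)]
    return min(candidates, key=as_tuple) if candidates else None

# A netty-style example: vulnerable up to 4.1.43, first fixed in 4.1.44.
versions = ["4.1.42", "4.1.43", "4.1.44", "4.1.46", "4.2.0"]
print(smallest_fixing_upgrade(versions, "4.1.44"))  # "4.1.44"
```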

Read more about solving vulnerabilities automatically through pull requests/merge requests or manually.

# How Vulnerable Functionality works

In order to perform Vulnerable Functionality analysis we need to know what parts of a library are vulnerable, and we need to know whether you are using those parts. Here we will explain how we figure this out.

# What parts of a library are vulnerable?

We start by fetching the latest vulnerable version and the first fixed version from our database. We then narrow this down to the smallest change we can find, using information from various sources, until we settle on two versions of the code, one vulnerable and one fixed. These versions can be released versions, git tags, git commits, or similar. Since whatever changed between these two versions fixed the vulnerability, the vulnerability must be contained in these changes. So, we download the code for these two versions, compute the difference and translate that into functions, classes and other code symbols, and store this in our database. Now we know, for a given vulnerability, what parts of the library are vulnerable. We call these parts the "vulnerable symbols" for a given vulnerability affecting a given dependency.
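
As a simplified illustration of that last step (not our actual pipeline, which works across many languages), here is how the changed functions between a vulnerable and a fixed version of a Python code base could be extracted by parsing both versions and comparing function bodies:

```python
import ast

def function_sources(source: str) -> dict:
    """Map function names to their source text for one file."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

def vulnerable_symbols(vulnerable_src: str, fixed_src: str) -> set:
    """Functions changed or removed by the fix: the 'vulnerable symbols'."""
    before, after = function_sources(vulnerable_src), function_sources(fixed_src)
    return {name for name, body in before.items() if after.get(name) != body}

# Hypothetical before/after snippets of a library file.
old = "def decompress(data):\n    return inflate(data)\n"
new = "def decompress(data):\n    check_size(data)\n    return inflate(data)\n"
print(vulnerable_symbols(old, new))  # {'decompress'}
```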

# Am I using the vulnerable parts?

In order to figure out if you use the vulnerable parts of the library, we need to know which parts you use. We do this by generating a call graph for your program and its libraries. This call graph is then uploaded to us alongside your dependency files, and we check whether it contains the vulnerable symbols we found in the previous step. If we find those symbols, we know you are using the parts that changed in fixing the vulnerability, and you are likely affected by the vulnerability. If your call graph does not contain the vulnerable symbols, you are most likely not using the vulnerable parts, and probably not affected by the vulnerability. Keep in mind that call graphs have neither perfect recall nor perfect accuracy; they might not include all calls that could happen, or they could include calls that can't happen. For this reason you should never assume you are perfectly safe even if we say you are not using the vulnerable functionality, and you should still upgrade the dependency, though it is not as high a priority as it would have been had we found that you use the vulnerable symbols.
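
A minimal sketch of that final check, with the call graph represented as a plain caller-to-callees map; the symbol names in the example are made up:

```python
from collections import deque

def reachable(call_graph: dict, entry_points: set) -> set:
    """All symbols reachable from the entry points in a caller -> callees map."""
    seen, queue = set(entry_points), deque(entry_points)
    while queue:
        for callee in call_graph.get(queue.popleft(), ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen

def uses_vulnerable_code(call_graph: dict, entry_points: set,
                         vulnerable_symbols: set) -> bool:
    """True if any vulnerable symbol can be reached from the application code."""
    return bool(reachable(call_graph, entry_points) & vulnerable_symbols)

# Hypothetical application calling a compression library.
graph = {
    "app.handle_upload": {"complib.decompress"},
    "complib.decompress": {"complib.inflate_chunk"},
}
print(uses_vulnerable_code(graph, {"app.handle_upload"}, {"complib.inflate_chunk"}))  # True
```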

# Real life example

A project using netty version 4.1.43 or older will, based on the information from vulnerability databases such as Debricked's public vulnerability database, be affected by, among others, CVE-2019-20444, CVE-2019-20445 and CVE-2020-11612. However, this is not necessarily the case, as CVE-2019-20444 and CVE-2019-20445 affect the web server part of netty (HTTP Request Smuggling), while CVE-2020-11612 affects decompression (Uncontrolled Resource Consumption due to not limiting size). As such, if one does not use netty for decompression, CVE-2020-11612 has no effect on the safety of the project and the user only has to upgrade to version 4.1.44 instead of 4.1.46. If one also does not use the web server part of netty, one does not need to upgrade at all, saving lots of time and resources.

Normally this would have to be determined by manual investigation, requiring in-depth knowledge not only of your own project but also of netty, and spending time reading through CVE specifications to determine which parts of netty are affected by a given CVE. With Debricked's Vulnerable Functionality feature, this is all automated.