
Automatically assessing exposure to known security vulnerabilities in third-party dependencies

Edward M. Poot

edwardmp@gmail.com

July 2016, 55 pages

Supervisor: dr. Magiel Bruntink

Host organisation: Software Improvement Group, https://www.sig.eu

Universiteit van Amsterdam

Faculteit der Natuurwetenschappen, Wiskunde en Informatica
Master Software Engineering


Abstract

Up to 80 percent of the code in modern software systems originates from third-party components. Software systems incorporate these third-party components ('dependencies') to avoid reinventing the wheel when common or generic functionality is needed. For example, Java systems often incorporate logging libraries such as the popular Log4j library. Usage of such components is not without risk; third-party dependencies frequently expose host systems to their vulnerabilities, such as the ones listed in publicly accessible CVE (vulnerability) databases. Yet a system's dependencies are often still not updated to versions that are known to be immune to these vulnerabilities. When a dependency is not updated promptly after a vulnerability is disclosed, persons with malicious intent may try to compromise the system. Tools such as Shodan [1] have emerged that can identify servers running a specific version of a vulnerable component, for instance version 4.2 of the Jetty webserver [2], which is known to be vulnerable [3]. Once a vulnerability is disclosed publicly, finding vulnerable systems is trivial using such tooling. This risk is often overlooked by the maintainers of a system. In 2011, researchers discovered that 37% of the 1,261 versions of 31 popular libraries studied contain at least one known vulnerability.

Tooling that continuously scans a system's dependencies for known vulnerabilities can help mitigate this risk. A tool like this, the Vulnerability Alert Service ('VAS'), has already been developed and is in active use at the Software Improvement Group ('SIG') in Amsterdam. The vulnerability reports generated by this tool are generally considered helpful, but the current tool has limitations. VAS does not report whether the vulnerable parts of a dependency are actually used or potentially invoked by the system; it only reports whether a vulnerable version of a dependency is used, not the extent to which the vulnerability can actually be exploited in the system.

Links to a specific Version Control System revision ('commit') of a system's code-base are frequently included in so-called CVE entries. CVE entries are bundles of meta-data related to a specific software vulnerability that has been disclosed. By using this information, the methods whose implementations have been changed can be determined by looking at the changes contained within a commit. These changes reveal which methods were involved in the conception of the vulnerability; these methods are assumed to contain the vulnerability. By tracing which of these vulnerable methods is invoked directly or indirectly by the system, we can determine the actual exposure to a vulnerability. The purpose of this thesis is to develop a proof-of-concept tool that incorporates such an approach to assessing the exposure to known vulnerabilities.

As a final step, the usefulness of the prototype tool will be validated. This is assessed by first using the tool in the context of SIG and then determining to what extent the results can be generalized to other contexts. We will show why tools like the one proposed are assumed to be useful in multiple contexts.

Keywords: software vulnerability, vulnerability detection, known vulnerabilities in dependencies, CVE, CPE, CPE matching, call graph analysis

[1] https://www.shodan.io
[2] https://www.shodan.io/search?query=jetty+4.2
[3] https://www.cvedetails.com/cve/CVE-2004-2478

Contents

1 Introduction
1.1 Problem analysis
1.2 Research questions
1.3 Definitions
1.4 Assumptions
1.5 Research method
1.6 Complexity
1.7 Outline

2 Related work
2.1 Tracking Known Security Vulnerabilities in Proprietary Software Systems
2.2 Tracking known security vulnerabilities in third-party components
2.3 The Unfortunate Reality of Insecure Libraries
2.4 Impact assessment for vulnerabilities in open-source software libraries
2.5 Measuring Dependency Freshness in Software Systems
2.6 Monitoring Software Vulnerabilities through Social Networks Analysis
2.7 An Analysis of Dependence on Third-party Libraries in Open Source and Proprietary Systems
2.8 Exploring Risks in the Usage of Third-Party Libraries
2.9 Measuring Software Library Stability Through Historical Version Analysis
2.10 An Empirical Analysis of Exploitation Attempts based on Vulnerabilities in Open Source Software
2.11 Understanding API Usage to Support Informed Decision Making in Software Maintenance

3 Research method
3.1 Introduction
3.2 Client helper cycle
3.2.1 Problem investigation
3.2.2 Treatment design
3.2.3 Design validation
3.2.4 Implementation and implementation evaluation
3.3 Research cycle
3.3.1 Research problem investigation
3.3.2 Research design
3.3.3 Research design validation
3.3.4 Analysis of results
3.4 Design cycle
3.4.1 Problem investigation
3.4.2 Artifact design
3.4.3 Design validation
3.4.4 Implementation and implementation evaluation

4 Designing a proof of concept tool
4.1 Research context
4.2.1 Gathering and downloading dependencies of a system
4.2.2 Gathering CVE data relevant to included dependencies
4.2.3 Establishing vulnerable methods
4.2.4 Ascertaining which library methods are invoked
4.2.5 Identifying vulnerable methods that are invoked
4.3 Detailed approach for automatically assessing exposure to known vulnerabilities
4.3.1 Determining vulnerable methods
4.3.2 Extracting dependency information
4.3.3 Creating a call graph
4.3.4 Determining actual exposure to vulnerable methods
4.3.5 External interface

5 Evaluation
5.1 Conducting analysis on client projects
5.1.1 Setup
5.1.2 Results
5.1.3 Interpretation
5.2 Finding known vulnerabilities without using CVE databases
5.2.1 Implementing retrieval of data from another source
5.2.2 Setup
5.2.3 Results
5.2.4 Interpretation
5.3 Finding vulnerabilities through GitHub that are not listed in CVE databases
5.3.1 Setup
5.3.2 Results
5.3.3 Interpretation
5.3.4 Conclusion
5.4 Evaluating usefulness with security consultants
5.4.1 Setup
5.4.2 Results
5.4.3 Interpretation
5.5 Reflection on usefulness
5.5.1 Result analysis research cycle
5.5.2 Implementation evaluation of the design cycle
5.6 Threats to validity
5.6.1 Conclusion validity
5.6.2 Construct validity
5.6.3 External validity

6 Conclusion and future work
6.1 Answering the research questions
6.1.1 To what extent is it possible to automatically determine whether vulnerable code in dependencies can potentially be executed?
6.1.2 How can we generalize the usefulness of the prototype tool based on its usefulness in the SIG context?
6.2 Future work

Bibliography


Before you lies the result of five months of hard work. Although I am the one credited for this work, this thesis could not have been produced without the help of several people.

First of all I would like to thank Mircea Cadariu for his reflections on the research direction I should pursue. My gratitude goes out to Theodoor Scholte for his input on the tool I developed. I would also like to acknowledge Reinier Vis for connecting me with the right persons. Special thanks to Marina Stojanovski, Sanne Brinkhorst and Brenda Langedijk for participating in interviews or facilitating them. I want to give a shout-out to Wander Grevink for setting up the technical infrastructure used during my research.

I sincerely appreciate the advice and guidance of my supervisor Magiel Bruntink during this period. Furthermore, I would like to express my gratitude to everyone else in the research department at the Software Improvement Group (SIG) for their input: Xander Schrijen, Haiyun Xu, Bárbara Vieira and Cuiting Chen. I would also like to thank all the other interns at SIG for their companionship during this period.

Finally, I would like to thank everybody else at SIG for providing me with the opportunity to write my thesis here.

Edward Poot

Amsterdam, The Netherlands
July 2016


Chapter 1

Introduction

1.1 Problem analysis

In April 2014, the cyber-security community came to know of a severe security vulnerability unprecedented in scale and severity. The vulnerability, quickly dubbed 'Heartbleed', was found in OpenSSL, a popular cryptography library that implements the Transport Layer Security (TLS) protocol. OpenSSL is incorporated in widely used web-server software like Apache, which powers the vast majority of websites on the internet today. The library is also used by thousands of other systems requiring cryptographic functionality. After the disclosure of this vulnerability, security researchers identified at least 600,000 systems connected to the public Internet that were exploitable due to this vulnerability [1]. This specific security incident makes it painfully clear that there is a shadow side to the use of open-source software. The widespread adoption of open-source software has made such systems easy victims. Once a vulnerability is disclosed, it can be trivial for malicious persons to exploit thousands of affected systems.

Contrary to popular belief, analysis by Ransbotham (2010) shows that, compared to proprietary systems, open-source systems have a greater risk of exploitation: exploits diffuse earlier and wider, and the overall volume of exploitation attempts is greater. The OWASP Top Ten [2] lists the most commonly occurring security flaws in software systems; using components with known vulnerabilities is number nine in the 2013 edition. The emergence of dependency management tools has caused a significant increase in the number of libraries involved in a typical application. In a report by Williams and Dabirsiaghi (2012), which investigates the prevalence of vulnerable libraries, it is recommended that systems and processes for monitoring the usage of libraries be established.

The SIG analyses the maintainability of clients' software systems and certifies systems with respect to their long-term maintainability. Security is generally considered to be related to the maintainability of a system: use of outdated dependencies with known vulnerabilities is a strong hint that maintainability is not a top priority. Furthermore, IT security is one of the main themes of the work SIG performs for its clients. The systems of SIG's clients typically depend on third-party components for common functionality. However, as indicated before, this is not without risk. In security-critical applications, such as banking systems, it is crucial to minimize the time between the disclosure of a vulnerability and the application of a patch that fixes it. Given the increasing number of dependencies used by applications, this can only be achieved by employing dedicated tooling.

In 2014 an intern at SIG, Mircea Cadariu (see Cadariu (2014); Cadariu et al. (2015)), modified an existing tool to scan the dependencies of a system for known vulnerabilities as part of his master's thesis. The tool was extended to index Project Object Model (POM) files, in which the dependencies of a system are declared when the Maven dependency management system is used. Interviews with consultants at SIG revealed that they generally considered the vulnerability reports useful, even though false positives were frequently reported. The interviewees mentioned that they would typically check whether the vulnerability description could be linked to functionality in dependencies that the client actually uses. However, this kind of manual verification is prone to human error: a consultant may mistakenly conclude that the vulnerable code is never executed. Furthermore, the need for manual verification by humans means that the disclosure of a critical and imminent threat to the client may be delayed. We propose to create a prototype tool that automatically indicates the usage of vulnerable functionality.

[1] http://blog.erratasec.com/2014/04/600000-servers-vulnerable-to-heartbleed.html
[2] https://www.owasp.org/index.php/Top_10_2013-Top_10

Plate et al. (2015) have published a paper in which a technique is proposed to identify vulnerable code in dependencies based on references to Common Vulnerabilities and Exposures (CVE) identifiers in the commit messages of a dependency. CVE identifiers are assigned to specific vulnerabilities when they are disclosed. The issue with this approach was that CVE identifiers were rarely referenced in commit messages, at least not structurally. In addition, manual effort was required to match Version Control System (VCS) repositories to specific dependencies. Moreover, Plate et al. (2015) indicate that even once a vulnerability is confirmed to be present in one of a system's dependencies, the dependency is regularly still not updated to mitigate the risk of exposure. In the enterprise context this can be attributed to the fact that these systems are presumed to be mission-critical, so downtime has to be minimized. The reluctance to update dependencies is caused by the belief that updating will introduce new issues. Because of these kinds of beliefs, there is a need to carefully assess whether a system requires an urgent patch to avert exposure to a vulnerability or whether the patch can be applied during the application's regular release cycle: a vulnerability that is actually exploitable and can be used to compromise the integrity of the system would require immediate intervention, while updating a library with a known vulnerability in untouched parts can usually be postponed.

Bouwers et al. (2015) state that prioritizing dependency updates proves difficult because the use of outdated dependencies is often opaque. The authors have devised a metric ('dependency freshness') to indicate whether recent versions of dependencies are generally used in a specific system. After calculating this metric for 75 systems, the authors conclude that only 16.7% of the dependencies incorporated in systems display no update lag at all. The large majority (64.1%) of the dependencies used in a system show an update lag of over 365 days, with a tail of up to 8 years. Overall, it is determined that in most systems it is not common practice to update dependencies on a regular basis. It is also discovered that the freshness rating has a negative correlation with the number of dependencies that contain known security vulnerabilities. More specifically, systems with a high median dependency freshness rating have a lower number of dependencies with reported security vulnerabilities, and vice versa. However, these metrics do not take into account how a dependency is actually used by the system. The tool we propose would be able to justify the urgency of updating dependencies by showing that a system is actually vulnerable; the risk of using outdated dependencies is no longer opaque.

Raemaekers et al. (2011) sought to assess the frequency of use of third-party libraries in both proprietary and open source systems. Using this information, a rating is derived based on the frequency of use of particular libraries and on the dependence on third-party libraries in a software system. This rating can be used to indicate the exposure to potential security risks introduced by these libraries. Raemaekers et al. (2012a) continue this inquiry in another paper, the goal of which was to explore to what extent risks involved in the use of third-party libraries can be assessed automatically. The authors hypothesize that risks in the usage of third-party libraries are influenced by the way a given system uses a specific library. They do not rely on CVE information, but the study does look at Application Programming Interface (API) usage as an indicator of risk.

We can conclude from the literature reviewed that vulnerabilities introduced in a system by its dependencies are a prevalent threat in today's technological landscape. Various tools have been developed to tackle this problem. However, to our knowledge a tool that tries to determine the actual usage of the API units introducing the vulnerable behavior is currently lacking. Therefore, the problem we seek to solve is how to automatically determine actual exposure to vulnerabilities introduced by a system's dependencies, rather than hypothetical exposure alone. A proof-of-concept tool will be created to indicate the feasibility of this approach. We will evaluate this tool in the context of our host company (SIG). Furthermore, we will generalize the usefulness of a tool featuring such functionality to multiple contexts.

1.2 Research questions

Research question 1 To what extent is it possible to automatically determine whether vulnerable code in dependencies can potentially be executed?


– How can we determine which methods of a dependency are called directly or indirectly?
– How do we determine which code was changed to fix a CVE?

– How can we validate the correctness of the prototype tool we will design?

Research question 2 How can we generalize the usefulness of the prototype tool based on its usefulness in the SIG context?

– In what ways can the tool implementing the aforementioned technique be usefully exploited at SIG?

– In what ways is the SIG use case similar to other cases?

1.3 Definitions

First, we will establish some common vocabulary that will be used in the remainder of this thesis. An overview of the acronyms we use is also provided at the end of this thesis.

Software vulnerabilities According to the Internet Engineering Task Force (IETF), a software vulnerability is defined as: “a flaw or weakness in a system’s design, implementation, or operation and management that could be exploited to violate the system’s security policy”. For the purpose of this thesis, we are primarily concerned with known vulnerabilities: vulnerabilities that have been disclosed in the past through some public channel.

CVE CVE is the abbreviated form of the term Common Vulnerabilities and Exposures. Depending on the context, it can have a slightly different meaning, but in all circumstances CVE relates to known security vulnerabilities in software systems.

First of all, CVE can be used to refer to an identifier assigned to a specific security vulnerability. When a vulnerability is disclosed, it is assigned an identifier of the form “CVE-YYYY-1234”: the CVE prefix, followed by the year in which the vulnerability was discovered, followed by a number that is unique among the vulnerabilities discovered in that year. This identifier serves as a mechanism through which different information sources can refer to the same vulnerability.
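The identifier scheme described above can be captured in a small validator. This is an illustrative sketch (the class and method names are our own, not part of any existing tool); note that since 2014 the sequence number may also have more than four digits:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper for the "CVE-YYYY-NNNN" identifier scheme.
public class CveId {
    // Four-digit year, then a sequence number of at least four digits.
    private static final Pattern CVE_PATTERN = Pattern.compile("CVE-(\\d{4})-(\\d{4,})");

    public static boolean isValid(String id) {
        return CVE_PATTERN.matcher(id).matches();
    }

    public static int year(String id) {
        Matcher m = CVE_PATTERN.matcher(id);
        if (!m.matches()) throw new IllegalArgumentException("Not a CVE id: " + id);
        return Integer.parseInt(m.group(1));
    }

    public static void main(String[] args) {
        System.out.println(isValid("CVE-2014-0160")); // Heartbleed: true
        System.out.println(year("CVE-2014-0160"));    // 2014
    }
}
```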

Secondly, a CVE can refer to a bundle of meta-data related to a vulnerability identified by a CVE identifier, something we will refer to as a CVE entry. For instance, a score indicating the severity of the vulnerability (“CVSS”) is assigned, as well as a description indicating how the vulnerability manifests. Moreover, a list of references is attached: a collection of links to other sources that have supplementary information on the vulnerability.

Finally, CVE is sometimes used synonymously with the databases containing the CVE entries. This is something we will refer to as CVE databases from now on. The National Vulnerability Database (NVD) is a specific database that we will use.

CPE CPE is an acronym for Common Platform Enumeration. One or more CPEs can be found in a CVE entry. CPEs are identifiers that identify the platforms affected by a specific vulnerability.
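As an illustration, a CPE URI such as "cpe:/a:mortbay:jetty:6.1.20" (an example that recurs later in this thesis) can be split into its part, vendor, product and version fields. This is a hypothetical helper for the colon-separated URI form only, not code from the tool:

```java
// Illustrative sketch: decompose a CPE URI of the form
// "cpe:/part:vendor:product:version" into its fields.
public class CpeUri {
    public final String part, vendor, product, version;

    public CpeUri(String uri) {
        if (!uri.startsWith("cpe:/")) throw new IllegalArgumentException(uri);
        String[] f = uri.substring("cpe:/".length()).split(":", -1);
        part    = f.length > 0 ? f[0] : "";  // "a" = application
        vendor  = f.length > 1 ? f[1] : "";
        product = f.length > 2 ? f[2] : "";
        version = f.length > 3 ? f[3] : "";
    }

    public static void main(String[] args) {
        CpeUri cpe = new CpeUri("cpe:/a:mortbay:jetty:6.1.20");
        System.out.println(cpe.vendor + " " + cpe.product + " " + cpe.version);
    }
}
```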

VCS VCS is an abbreviation for Version Control System. This refers to a class of systems used to track changes in source code over time. Version Control Systems use the notion of revisions: the initial source code that is added is revision one, and after the first change is made, the code is at revision two.

As of 2016, the most popular VCS is Git. Git is a distributed VCS, in which the source code may be dispersed over multiple locations. Git has the concept of repositories, in which such a copy of the source code is stored. The website GitHub is currently the most popular platform for hosting these repositories.

In Git, revisions are called commits. Moreover, Git and GitHub introduce other meta-data concepts such as tags and pull requests respectively. We will commonly refer to such pieces of meta-data as VCS artifacts. GitHub also introduces the notion of issues, through which problems related to a system can be discussed.

(9)

Dependencies Software systems often incorporate third-party libraries that provide common functionality, to preclude developing such functionality in-house and thereby reinventing the wheel. The advantages of using such libraries include shortened development times and cost savings, due to not having to develop and maintain such components.

Since a system now depends on these libraries to function, we call these libraries the dependencies of the system. New versions of libraries containing bug-fixes and security improvements may be released by the maintainers. To aid in the process of keeping these dependencies up-to-date, dependency management systems have emerged. One of the most popular is Maven, a dependency management system for applications written in the Java programming language. In Maven, the dependencies are declared in an XML file referred to as the Project Object Model file, or POM file for short.
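As a sketch of what extracting declared dependencies looks like, the following hypothetical snippet pulls groupId:artifactId:version coordinates out of a POM document using only the JDK's built-in XML parser. It is deliberately naive: a real implementation would also need to handle transitive dependencies, property interpolation, parent POMs and dependencyManagement sections.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Naive sketch (not the VAS implementation): list declared dependencies in a POM.
public class PomDependencies {
    public static List<String> coordinates(String pomXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(pomXml.getBytes(StandardCharsets.UTF_8)));
        List<String> result = new ArrayList<>();
        NodeList deps = doc.getElementsByTagName("dependency");
        for (int i = 0; i < deps.getLength(); i++) {
            Element d = (Element) deps.item(i);
            result.add(text(d, "groupId") + ":" + text(d, "artifactId") + ":" + text(d, "version"));
        }
        return result;
    }

    private static String text(Element parent, String tag) {
        NodeList n = parent.getElementsByTagName(tag);
        return n.getLength() > 0 ? n.item(0).getTextContent().trim() : "";
    }

    public static void main(String[] args) throws Exception {
        String pom = "<project><dependencies><dependency>"
                + "<groupId>org.mortbay.jetty</groupId>"
                + "<artifactId>jetty</artifactId>"
                + "<version>6.1.20</version>"
                + "</dependency></dependencies></project>";
        System.out.println(coordinates(pom)); // [org.mortbay.jetty:jetty:6.1.20]
    }
}
```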

1.4 Assumptions

Based on initial analysis conducted, we have established the following assumptions about known security vulnerabilities:

Assumption 1 It is becoming increasingly more likely that CVE entries refer to VCS artifacts.

Assumption 2 The commits referred to in CVE entries contain the fix for the vulnerability.

Assumption 3 The methods whose implementation has been changed as indicated by the commit contain the fix for a vulnerability.

We will substantiate each assumption in the following paragraphs.

It is becoming increasingly more likely that CVE entries contain references to VCS artifacts The approach we envision for assessing the actual exposure to vulnerabilities relies heavily on the presence of VCS references in CVE entries. The percentage of CVE entries having at least one VCS reference is still quite low (6.48% to be precise), but over the years we observe a positive trend. Figure 1.1 provides a graphical depiction of this trend. With the notable exception of the year 2015, the absolute number of CVE entries having at least one VCS reference is increasing year over year. The year 2015 probably deviates from this trend simply because the absolute number of CVEs in that year is lower than in other years.

Figure 1.1: The absolute number of CVE in the NVD database having at least one VCS reference increases almost every year.
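A heuristic of roughly the following shape could be used to classify a CVE reference URL as a VCS reference. The exact patterns counted for Figure 1.1 are not specified here, so this sketch is only an illustrative approximation:

```java
// Illustrative heuristic: does a CVE reference URL point to a VCS artifact?
public class VcsReference {
    public static boolean isVcsReference(String url) {
        String u = url.toLowerCase();
        return (u.contains("github.com") && (u.contains("/commit/") || u.contains("/pull/")))
                || u.contains("a=commitdiff")   // gitweb commit-diff view
                || u.contains("a=commit");      // gitweb commit view
    }

    public static void main(String[] args) {
        // The OpenSSL fix commit referenced for Heartbleed is a gitweb URL.
        System.out.println(isVcsReference(
            "https://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=96db902")); // true
    }
}
```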


The commits referred to in CVE entries contain the fix for the vulnerability Based on manual examination of several CVE entries, it appears that when there is a reference to a commit or other VCS artifact, the code changes included in that commit encompass the fix for the vulnerability. There are corner cases where this does not apply; we already encountered a commit link that referred to an updated change-log file stating that the problem was solved, instead of the actual code change remedying the problem. This does not matter in our case, since we only take source code into account.

The methods whose implementation has been changed as indicated by the commit contain the fix for a vulnerability We have analyzed a number of patches. Regularly, when a vulnerability is disclosed publicly, only certain method implementations are changed to fix the vulnerability. A helpful illustration is the commit containing the fix for the now infamous Heartbleed vulnerability (CVE-2014-0160) in the OpenSSL library mentioned at the beginning of this chapter. After investigating the related CVE entry, we observe that there is indeed a link to the commit containing the fix, as expected. Looking at the modifications in the respective commit [6], we observe that, apart from added comments, only a single method implementation was changed: the one containing the fix for the vulnerability.
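The first step in locating such changed methods is reading the fix commit's diff. The following naive sketch (our own illustration, not the thesis tool) extracts the touched files and hunk start lines from a unified diff; mapping those line ranges to the enclosing method declarations would additionally require parsing the source files.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Naive sketch: map each changed file in a unified diff to the starting
// lines (post-image) of its hunks.
public class DiffHunks {
    private static final Pattern FILE = Pattern.compile("^\\+\\+\\+ b/(.*)$");
    private static final Pattern HUNK =
            Pattern.compile("^@@ -\\d+(?:,\\d+)? \\+(\\d+)(?:,\\d+)? @@.*$");

    public static Map<String, List<Integer>> changedLines(String diff) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        String current = null;
        for (String line : diff.split("\n")) {
            Matcher f = FILE.matcher(line);
            if (f.matches()) {
                current = f.group(1);
                result.put(current, new ArrayList<>());
                continue;
            }
            Matcher h = HUNK.matcher(line);
            if (h.matches() && current != null) {
                result.get(current).add(Integer.parseInt(h.group(1)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened, illustrative diff fragment (line numbers invented).
        String diff = "--- a/ssl/t1_lib.c\n"
                + "+++ b/ssl/t1_lib.c\n"
                + "@@ -2554,6 +2554,9 @@ some context\n"
                + " unchanged line\n";
        System.out.println(changedLines(diff)); // {ssl/t1_lib.c=[2554]}
    }
}
```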

1.5 Research method

We will employ Action Research to evaluate the usefulness of a prototype tool that can automatically assess exposure to known vulnerabilities. More specifically, we employ Technical Action Research (TAR). Our instantiation of TAR is presented in Chapter 3. Action Research is a form of research in which researchers seek to combine theory and practice (Moody et al., 2002; Sjøberg et al., 2007). The tool will be created in the context of our host company, the Software Improvement Group (SIG), located in Amsterdam. First, the usefulness of such a tool is determined in the context of this company; later on, we will try to determine the components that contribute to this perceived usefulness and hypothesize whether they would also contribute to usefulness in other contexts. During the initial study of the prototype tool's usefulness in the context of the host organization, potential problems threatening the usefulness of the tool can be solved.

1.6 Complexity

There are a lot of moving parts involved in the construction of the prototype tool that need to be carefully aligned to obtain meaningful results. These complexities include working with a multitude of vulnerability sources and third-party libraries. We need to interact with local and remote Git repositories, retrieve information using the GitHub API, invoke Maven commands programmatically, conduct call graph analysis, work with existing vulnerability sources and parse source code.

Limitations of using CVEs CVE databases can be used, but they are known to have certain limitations. A limitation we are aware of is that the correct matching between information extracted from dependency management systems and CPE identifiers is not always possible due to ambiguities in naming conventions. Heuristics can be employed to overcome some of these limitations.
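One such heuristic could compare a Maven coordinate's artifactId and groupId against the CPE's product and vendor fields. The sketch below is purely illustrative and far simpler than the matching actually needed in practice (it handles none of the naming ambiguities mentioned above):

```java
// Illustrative heuristic only: match "groupId:artifactId:version" against
// "cpe:/a:vendor:product:version".
public class CpeMatcher {
    public static boolean matches(String mavenCoord, String cpe) {
        String[] m = mavenCoord.split(":");
        String[] c = cpe.substring("cpe:/".length()).split(":");
        if (m.length < 3 || c.length < 4) return false;
        String groupId = m[0], artifactId = m[1], version = m[2];
        String vendor = c[1], product = c[2], cpeVersion = c[3];
        // Heuristic: product must equal the artifactId, the vendor must occur
        // somewhere in the groupId, and the versions must be identical.
        return artifactId.equals(product)
                && groupId.contains(vendor)
                && version.equals(cpeVersion);
    }

    public static void main(String[] args) {
        System.out.println(matches("org.mortbay.jetty:jetty:6.1.20",
                                   "cpe:/a:mortbay:jetty:6.1.20")); // true
    }
}
```

The example coordinate and CPE pair are the ones used when discussing the matching mechanism of Cadariu et al. (2015) later in this thesis.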

Working with APIs of GitHub/Git We could use the GitHub API to retrieve the patches included in a specific commit. However, not all open-source dependencies use GitHub; they may also serve Git through private servers. Fortunately, we can also clone a remote repository locally using JGit [7] to obtain patch information. In addition, the GitHub API for issues can be used to obtain other meta-data that could be of interest to us.

Call graph analysis Once we have retrieved the relevant patches for our library and derived a list of methods that are expected to be vulnerable, we need to determine whether these methods are executed directly or indirectly by the parent system. This can be achieved using a technique known as call graph analysis. Call graph analysis tools are available for virtually any programming language. There is also a huge body of research explaining the currently used methods, static and dynamic analysis, in detail.

[6] https://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=96db902
[7] https://eclipse.org/jgit


Also, we need to be aware of the limitations of these tools. All call graph tools identified for Java have issues processing source code, as opposed to JAR files containing bytecode. Therefore, a different method needs to be devised to trace the initial method call within a system's source code to a library method. Based on evaluating various tools for generating call graphs, we expect that we can reliably determine this under normal circumstances. By normal circumstances we mean that method invocation through reflection is usually not traced by call graph libraries; nonetheless, we do not expect that systems extensively use reflection to interact with third-party libraries.
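Setting reflection aside, the core question reduces to graph reachability: given a call graph mapping each method to its callees, which known-vulnerable library methods can be reached from the system's own methods? A minimal sketch (all method names below are invented for illustration):

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the reachability check at the heart of the approach: traverse
// the call graph from the system's entry methods and collect every
// known-vulnerable method that is reached, directly or indirectly.
public class Reachability {
    public static Set<String> reachable(Map<String, Set<String>> callGraph,
                                        Set<String> entryPoints,
                                        Set<String> vulnerableMethods) {
        Set<String> seen = new HashSet<>(entryPoints);
        Deque<String> work = new ArrayDeque<>(entryPoints);
        Set<String> hit = new HashSet<>();
        while (!work.isEmpty()) {
            String method = work.pop();
            if (vulnerableMethods.contains(method)) hit.add(method);
            for (String callee : callGraph.getOrDefault(method, Collections.emptySet())) {
                if (seen.add(callee)) work.push(callee);
            }
        }
        return hit; // vulnerable methods actually invoked
    }

    public static void main(String[] args) {
        Map<String, Set<String>> cg = new HashMap<>();
        cg.put("app.Main.run", Set.of("lib.Parser.parse"));
        cg.put("lib.Parser.parse", Set.of("lib.Parser.decode"));
        System.out.println(reachable(cg, Set.of("app.Main.run"),
                                     Set.of("lib.Parser.decode")));
    }
}
```

In practice the call graph itself would be produced by one of the tools evaluated above; the traversal then decides whether a reported vulnerability constitutes actual rather than hypothetical exposure.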

1.7 Outline

The rest of this thesis is structured as follows. We first examine related work. This is followed by an explanation of our instantiation of TAR. Then, we describe both the high-level design and the low-level implementation of our prototype tool. This is followed by an evaluation of the usefulness of the tool. Finally, we answer the research questions in the conclusion.


Chapter 2

Related work

In this chapter we will review related work on the topic of known vulnerabilities in third-party components. The goal of the chapter is to provide insight into the prevalence of the problem and the research that has been conducted related to this topic so far.

2.1 Tracking Known Security Vulnerabilities in Proprietary Software Systems (Cadariu et al., 2015)

Software systems are often prone to security vulnerabilities introduced by their third-party components. It is therefore crucial that these components are kept up to date, by providing early warnings when new vulnerabilities for those dependencies are disclosed so that appropriate action can be taken.

A high-level description is given of an approach for creating a tool that provides such early warnings. In modern build environments, dependency managers are used, such as Maven for Java projects. These tools process information about the dependencies to be included from a structured XML file; for Maven systems this is the POM file. This file can be used to gather the list of dependencies used by a project, as opposed to other strategies such as looking at import statements in Java code. The approach can easily be extended to dependency managers for other programming languages that use similar configuration files, such as Python (PyPI), Node.js (npm), PHP (Composer) and Ruby (Gems). As a source of vulnerability data, existing CVE databases are used. Common Platform Enumeration (CPE) identifiers contained within CVE reports uniquely identify affected platforms.

An existing system, OWASP Dependency Check, which already features some of the requested functionality, is employed and extended to support retrieving dependencies from POM files.

A matching mechanism is devised to match dependency names retrieved from Maven with CPE identifiers. For example, a specific Maven dependency can be identified as “org.mortbay.jetty:jetty:6.1.20”, while the corresponding CPE is “cpe:/a:mortbay:jetty:6.1.20”. False positive and false negative rates are determined by calculating precision and recall: 50 matches are sampled randomly and checked for relevance. Precision is quite low (14%), while recall is higher (80%).
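For reference, precision is TP/(TP+FP) and recall is TP/(TP+FN). The sketch below only restates these definitions; the sample counts in the example are invented for illustration, and only the 14% and 80% figures come from the paper itself.

```java
// Standard precision/recall definitions, as used in the evaluation above.
public class Metrics {
    public static double precision(int truePositives, int falsePositives) {
        return (double) truePositives / (truePositives + falsePositives);
    }

    public static double recall(int truePositives, int falseNegatives) {
        return (double) truePositives / (truePositives + falseNegatives);
    }

    public static void main(String[] args) {
        // Hypothetical counts: 7 relevant matches in a sample of 50 reported
        // matches would yield the 14% precision reported by the paper.
        System.out.println(precision(7, 43)); // 0.14
    }
}
```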

The prevalence of the known-vulnerabilities-in-dependencies phenomenon in practice is assessed. A total of 75 client systems available at SIG are used to test the prototype tool. The majority of them, 54, have at least one vulnerable dependency; the maximum observed is seven vulnerable dependencies.

Finally, technical consultants working at the host company evaluate the usefulness of such a system in practice. Interviews with consultants working at SIG are held to discuss the analysis results. Without the system, respondents would not have considered outdated dependencies and their impact on the security of the system. One specific customer was informed and greatly appreciated the detection of this vulnerability in his system.

The problem investigated is partially similar to the topic we are researching. The difference between this approach and ours is that the tool proposed in this paper does not report whether an identified vulnerability actually affects the system, e.g. to what extent the reported vulnerable methods or classes are actually used. In addition, like this research, we are also interested in evaluating the usefulness of such a security tool.

2.2 Tracking known security vulnerabilities in third-party components (Cadariu, 2014)

The paper "Tracking Known Security Vulnerabilities in Proprietary Software Systems" described previously is based on this prior research, which is a thesis. The thesis expands on several topics; the information is largely the same, but more detailed. The goal of the thesis is to propose a method to continuously track known vulnerabilities in third-party components of software systems and to assess its usefulness in a relevant context.

All potential publicly available sources of vulnerability reports (CVEs) are considered. Eventually the NVD is chosen, because it appeared to be the only source at that time offering XML feeds listing the vulnerabilities.

Finally, interviews with consultants at SIG are conducted to assess the usefulness of the prototype tool that was developed during the course of this research. Evaluation shows that the method produces useful security-related alerts consistently reflecting the presence of known vulnerabilities in third party libraries of software projects.

This study has shown that the NVD is the most useful vulnerability database for this kind of research, due to its adequacy for the research goal and its convenient data export features. This database contains known vulnerabilities that have been assigned a standardized CVE identifier. However, for a vulnerability to be known, it does not necessarily need to go through the process that leads to a CVE assignment. Some security vulnerabilities are public knowledge before receiving a CVE identifier, for instance when users of open-source projects signal security vulnerabilities. Ideally, tracking known vulnerabilities would mean indexing every possible source that publishes information regarding software security threats; this has not been investigated in this research. In our research we will keep in mind that CVE databases are not the only data source for vulnerabilities, in case we run into problems with these traditional sources of vulnerability information.

2.3 The Unfortunate Reality of Insecure Libraries (Williams and Dabirsiaghi, 2012)

This article shows the prevalence and relevance of the issue of using libraries with known vulnerabilities. The authors show that there are significant risks associated with the use of libraries.

A significant majority of the code found in modern applications originates from third-party libraries and frameworks. Organizations place strong trust in these libraries by incorporating them in their systems. However, after analyzing nearly 30 million downloads from the Maven Central dependency repository, the authors discover that almost 30% of the downloaded dependencies contain known vulnerabilities. The authors conclude that this phenomenon proves that most organizations are unlikely to have a strong policy in place for keeping libraries up to date, to prevent systems from becoming compromised by known vulnerabilities in the dependencies used.

The security aspect of in-house developed code is normally given proper security attention but, in contrast, the possibility that risk comes from third-party libraries is barely considered by most companies. The 31 most downloaded libraries are closely examined. It turns out that 37% of the 1,261 versions of those libraries contain known vulnerabilities. Even more interesting is that security-related libraries turn out to be 20% more likely to have reported security vulnerabilities than, say, a web framework. These libraries are expected to simply have more reported vulnerabilities due to the nature of the library; they receive more attention and scrutiny from researchers and hackers.

Finally, it is found that larger organizations on average have downloaded 19 of the 31 most popular Java libraries. Smaller organizations downloaded a mere 8 of these libraries. The functionality offered by some of these libraries overlaps with functionality in other libraries. This is a concern because this indicates that larger organizations have not standardized on using a small set of trusted libraries. More libraries used means more third-party code is included in a system, and more code leads to a higher chance of security vulnerabilities being present.

The authors conclude that deriving metrics indicating which libraries are in use and how far out-of-date and out-of-version they are would be a good practice. They recommend establishing systems and processes to lessen the exposure to known security vulnerabilities introduced by third-party dependencies, as the use of dependency management tools has caused a significant increase in the number of libraries involved in a typical application.

2.4 Impact assessment for vulnerabilities in open-source software libraries (Plate et al., 2015)

Due to the increased inclusion of open-source components in systems, each vulnerability discovered in a bundle of dependencies potentially jeopardizes the security of the whole application. After a vulnerability is discovered, its impact on a system has to be assessed. Current decision-making is based on high-level vulnerability descriptions and expert knowledge, which is not ideal due to the manual effort required and its proneness to error. In this paper a more pragmatic approach to assessing the impact is proposed.

Once a vulnerability is discovered, the dependencies of a system will sometimes still not be updated to neutralize the risk of exposure. In the enterprise context this can be attributed to the fact that these systems are mission-critical, so downtime has to be minimized. The problem with updating dependencies is that new issues may be introduced; enterprises are reluctant to update their dependencies more frequently for this reason. Consequently, system maintainers need to carefully assess whether an application requires an urgent patch or whether the update can be applied during the application's regular release cycle. The question that arises is whether it can be determined if a vulnerability found in a dependency originates from parts of the dependency's API that are used by the system. In this paper a possible approach to assess this is described.

The following assumption is made: Whenever an application incorporates a library known to be vulnerable and executes a fragment of the library that contains the vulnerable code, there is a significant risk that the vulnerability can be exploited. The authors collect execution traces of applications, and compare those with changes that would be introduced by the security patches of known vulnerabilities in order to detect whether critical library code is executed. Coverage is measured by calculating the intersection between programming constructs that are both present in the security patch and that are, directly or indirectly, executed in the context of the system.
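The intersection described above can be sketched as follows. Representing "programming constructs" simply as fully qualified method names is a simplification, and all names in the example are invented, not taken from the paper.

```python
# Sketch: coverage of patch-touched constructs by an execution trace.
# Constructs are modeled as method-name strings; a real implementation would
# derive them from the security patch diff and from runtime instrumentation.

def vulnerable_coverage(patched_constructs: set, executed_constructs: set) -> float:
    """Fraction of constructs changed by the security patch that the
    application actually executes (directly or indirectly)."""
    if not patched_constructs:
        return 0.0
    return len(patched_constructs & executed_constructs) / len(patched_constructs)

# Hypothetical example: the patch touches two parser methods, the trace
# shows one of them being executed.
patch = {"HttpParser.parseNext", "HttpParser.quoted"}
trace = {"Server.start", "HttpParser.parseNext"}
print(vulnerable_coverage(patch, trace))  # 0.5
```

A non-zero coverage value would indicate a significant risk that the vulnerability is exploitable, per the assumption above.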

Practical problems arise due to use of different sources such as VCS repositories and CVE databases. This is mainly attributed to the use of non-standardized methods to refer to a certain library and versions.

The authors state that once a vulnerability is discovered, its impact on a system has to be assessed. Their approach is somewhat similar to ours: look at the VCS repositories of dependencies and try to determine the changes that occurred after the vulnerable version was released, up to the point where the vulnerability was patched. However, manual effort is needed to connect CVE entries to VCS repositories.

A key problem their approach faces is how to reliably relate CVE entries to the affected software products and the corresponding source code repository, down to the level of accurately matching vulnerability reports with the code changes that fix them. This information was apparently unavailable or went unnoticed when their research was conducted; our preliminary investigation shows that VCS links are often referenced in the CVE entry itself, so there is no need to manually provide this information for each dependency.


2.5 Measuring Dependency Freshness in Software Systems (Bouwers et al., 2015)

Prioritizing dependency updates often proves to be difficult since the use of outdated dependencies can be opaque. The goal of this paper is making this usage more transparent by devising a metric to quantify how recent the versions of the used dependencies are in general. The metric is calibrated by basing the thresholds on industry benchmarks. The usefulness of the metric in practice is evaluated. In addition, the relation between outdated dependencies and security vulnerabilities is determined.

In this paper, the term “freshness” is used to denote the difference between the used version of a dependency and the desired version of a dependency. In this research the desired situation equates to using the latest version of the dependency. The freshness values of all dependencies are aggregated to the system-level using a benchmark-based approach.

A study is conducted to investigate the prevalence of outdated dependencies among 75 Java systems. Maven POM files are used to determine the dependencies used in the systems. When considering the overall state of dependency freshness using a version sequence number metric, the authors conclude that only 16.7% of the dependencies display no update lag at all, i.e. the most recent version of the dependency is used. Over 50% of the dependencies have an update lag of at least 5 versions. The version release date distance paints an even worse picture: the large majority (64.1%) of the dependencies have an update lag of over 365 days, with a tail of up to 8 years. Overall, the authors conclude that updating dependencies on a regular basis is apparently not common practice.
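The two freshness measurements above can be sketched as follows; the release history used in the example is hypothetical.

```python
# Sketch of the two dependency freshness measurements: version sequence lag
# (how many releases behind the latest version) and release date distance
# (how many days behind the latest release).
from datetime import date

def version_lag(releases: list, used: str) -> int:
    """Releases are ordered oldest-to-newest; lag 0 means up to date."""
    return len(releases) - 1 - releases.index(used)

def date_distance(release_dates: dict, used: str, latest: str) -> int:
    """Days between the release of the used and the latest version."""
    return (release_dates[latest] - release_dates[used]).days

releases = ["1.0", "1.1", "2.0", "2.1"]
dates = {"1.0": date(2012, 1, 1), "2.1": date(2015, 6, 1)}
print(version_lag(releases, "1.1"))        # 2 versions behind
print(date_distance(dates, "1.0", "2.1"))  # 1247 days behind
```

In the terms used above, a dependency with lag 0 is "fresh", while a date distance over 365 days falls in the worst category observed for most dependencies.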

Given the measurement of freshness on the dependency level, a system level metric can be defined by aggregating the lower level measurements. This aggregation method works with a so-called risk profile that in this case describes which percentage of dependencies falls into one of four risk categories.

To determine the relationship between the dependency freshness rating and security vulnerabilities the authors calculate the rating for each system and determine how many of the dependencies used by a system have known security vulnerabilities.

The experiment points out that systems with a high median dependency freshness rating show a lower number of dependencies with reported security vulnerabilities. The opposite also holds. Moreover, systems with a low dependency freshness score are more than four times as likely to incorporate dependencies with known security vulnerabilities.

This study relates to our topic because it shows that there is a relation between outdated dependencies and security vulnerabilities. The tool we propose can justify the importance of updating dependencies by showing the vulnerabilities the system is otherwise exposed to; the use of outdated dependencies is no longer opaque.

2.6 Monitoring Software Vulnerabilities through Social Networks Analysis (Trabelsi et al., 2015)

Security vulnerability information is spread over the Internet, and it requires manual effort to track all these sources. Trabelsi et al. (2015) noticed that the information in these sources is frequently aggregated on Twitter. Therefore, Twitter can be used to find information about software vulnerabilities, which can even include information about zero-day exploits that have not yet been submitted to CVE databases. The authors propose a prototype tool to index this information.

First, a clustering algorithm for social media content is devised, grouping all information regarding the same subject matter, which is a pre-requisite for distinguishing known from new security information.

The system comprises two subsystems: a data collection part and a data processing part. The data collection part stores information including common security terminology such as “vulnerability” or “exploit” combined with names of software components such as “Apache Commons”. Apart from Twitter information, a local mirror of a CVE database, such as the NVD, is stored. This database is used to categorize security information obtained from Twitter, in particular to distinguish new information from the repetition of already known vulnerability information. The data processing part identifies, evaluates and classifies the security information retrieved from Twitter. The data is processed using data-mining algorithms, each implemented by a so-called analyzer. One element of this system is a pre-processor that filters out duplicate tweets or content not meeting certain criteria.

To detect zero-day vulnerability information, the authors identify clusters of information that relate to the same issue of some software component and contain specific vulnerability keywords.

The prototype tool conducts a Twitter search by identifying information matching the regular expression “CVE-*-” to obtain all messages dealing with CVEs. The messages are then grouped by CVE identifier to obtain clusters of messages dealing with the same CVE. From these clusters the authors extract the common keywords in order to identify the manifestation of the vulnerability.
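The grouping step above can be sketched as follows. The pattern used here follows the standard CVE-YYYY-NNNN identifier format rather than the literal “CVE-*-” wildcard quoted from the paper, and the example messages are invented.

```python
# Sketch: extract CVE identifiers from short messages with a regular
# expression and group the messages per identifier.
import re
from collections import defaultdict

CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}")

def cluster_by_cve(messages: list) -> dict:
    """Map each CVE identifier to the list of messages mentioning it."""
    clusters = defaultdict(list)
    for msg in messages:
        for cve_id in CVE_RE.findall(msg):
            clusters[cve_id].append(msg)
    return dict(clusters)

tweets = [
    "PoC released for CVE-2014-0160 (heartbleed)",
    "Patch your servers: CVE-2014-0160 is being exploited",
    "New advisory CVE-2015-1234 published",
]
clusters = cluster_by_cve(tweets)
print(sorted(clusters))                 # ['CVE-2014-0160', 'CVE-2015-1234']
print(len(clusters["CVE-2014-0160"]))   # 2
```

Keyword extraction over each cluster would then follow, as described above.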

Furthermore, the results of an empirical study comparing the availability of information published through social media (e.g. Twitter) and classical sources (e.g. the NVD) are presented. The authors conducted two studies comparing the freshness of the collected data to the traditional sources. The first study compares the publication date of CVEs in the NVD with their publication date on social media: 41% of the CVEs were discussed on Twitter before they were listed in the NVD. The second study investigates the publication date of zero-day vulnerabilities on social media relative to the publication date of the related CVE in the NVD: 75.8% of these vulnerabilities were disclosed on social media before their official disclosure in the NVD.

The research conducted by Trabelsi et al. (2015) relates to our topic because we might also want to use unconventional sources (i.e. other than CVE databases) to either obtain new vulnerability information or complement existing vulnerability data.

2.7 An Analysis of Dependence on Third-party Libraries in Open Source and Proprietary Systems (Raemaekers et al., 2012a)

At present there is little insight into the actual usage of third-party libraries in real-world applications, as opposed to general download statistics. The authors of this paper seek to identify the frequency of use of third-party libraries among proprietary and open-source systems. This information is used to derive a rating that reflects the frequency of use of specific libraries and the dependence on third-party libraries. The rating can be employed to estimate the amount of exposure to possible security risks present in these libraries.

To obtain the frequency of use of third-party libraries, import and package statements are extracted from a set of Java systems. After processing the import and package statements, a rating is calculated for individual third-party libraries and the systems that incorporate these libraries. The rating for a specific library consists of the number of different systems it is used in divided by the total number of systems in the sample system set. The rating for a system as a whole is the sum of all ratings of the libraries it contains, divided by the square of the number of libraries.
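The two formulas above translate directly into code. The usage data below, a mapping from system name to the set of libraries it imports, is invented for illustration.

```python
# Sketch of the two ratings described above:
#   library rating = systems using the library / total systems in the sample
#   system rating  = sum of its libraries' ratings / (number of libraries)^2

def library_rating(library: str, systems: dict) -> float:
    used_in = sum(1 for libs in systems.values() if library in libs)
    return used_in / len(systems)

def system_rating(system: str, systems: dict) -> float:
    libs = systems[system]
    return sum(library_rating(lib, systems) for lib in libs) / len(libs) ** 2

# Hypothetical sample: three systems and the libraries they import.
systems = {
    "A": {"log4j", "jetty"},
    "B": {"log4j"},
    "C": {"log4j", "obscure-lib"},
}
print(library_rating("log4j", systems))  # 1.0 (used by all three systems)
print(system_rating("C", systems))       # (1.0 + 1/3) / 2^2, roughly 0.33
```

Note how system C, which uses an obscure library, is rated lower than system B, which only uses the ubiquitous one; this is the collective-judgment effect the authors exploit.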

The authors hypothesize that when a library is shown to be incorporated frequently in multiple systems there must have been a good reason to do so. The reasoning behind this is that apparently a large number of teams deems the library safe enough to use and therefore have made a rational decision to prefer this library over another library offering similar functionality. It is assumed that people are risk-averse in their choice of third-party libraries and that people therefore tend to prefer safer libraries to less safe ones. The authors thus exploit the collective judgment in the rating.

Raemaekers et al. (2012a) also assume that the more third-party library dependencies a system has, the higher the exposure to risk in these libraries becomes. The analysis shows that the frequency of use and the number of libraries used can give valuable insight into the usage of third-party libraries in a system.

The final rating devised ranks more common third-party libraries higher than less common ones, and systems with a large number of third-party dependencies are rated lower than systems with fewer third-party dependencies.

This paper relates to our topic because the rating derived may correlate with the secureness of a library or system as a whole; if a lot of obscure dependencies are used by the system it could be considered to be less safe. However, this assumption does not necessarily hold in all cases because a popular library may attract more attention from hackers and thus is a more attractive target to exploit than less commonly used libraries.


2.8 Exploring Risks in the Usage of Third-Party Libraries (Raemaekers et al., 2011)

Using software libraries may be tempting but we should not ignore the risks they can introduce to a system. These risks include lower quality standards or security risks due to the use of dependencies with known vulnerabilities. The goal of this paper is to explore to what extent the risks involved in the use of third-party libraries can be assessed automatically. A rating based on frequency of use is proposed to assess this. Moreover, various library attributes that could be used as risk indicators are examined. The authors also propose an isolation rating that measures the concentration and distribution of library import statements in the packages of a system. Another goal of this paper is to explore methods to automatically calculate such a rating based on static source code analysis.

First, the frequency of use of third-party libraries in a large corpus of open-source and proprietary software systems is analyzed. Secondly, the authors investigate additional library attributes that could serve as indicators for risks in the usage of third-party libraries. Finally, the authors investigate ways to improve this rating by incorporating information on the distribution and concentration of third-party library import statements in the source code. The result is a formula by which one can calculate the rating based on the frequency of use, the number of third-party libraries that a system uses and the encapsulation of calls to these libraries in sub-packages of a system.

The rating for a specific library that the authors propose in this paper is the number of different systems it is used in divided by the total number of systems in the data set. The rating for a system is the average of all ratings of the libraries it contains, divided by the number of libraries.

Risks in the usage of third party libraries are influenced by the way a given system is using a specific library. In particular, the usage can be well encapsulated in one dedicated component (which would isolate the risk), or scattered through the entire system (which would distribute risk to multiple places and makes it costly to replace the library).

When a library is imported frequently in a single package but rarely in other packages, this results in an array of frequencies with high ’inequality’ relative to each other. Ideally, a third-party library should be imported only in specific packages dealing with that library, thus reducing the amount of code ’exposed’ to possible risks in this library.
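One way to quantify the ’inequality’ of per-package import frequencies described above is a Gini coefficient over the import counts. The choice of metric and the sample counts below are illustrative, not taken from the paper.

```python
# Sketch: Gini coefficient over per-package import counts of one library.
# 0.0 means imports are spread evenly over packages; values approaching 1.0
# mean imports are concentrated in a few packages (well-isolated usage).

def gini(counts: list) -> float:
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    # Standard formula based on the sorted cumulative distribution.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n

print(gini([5, 5, 5, 5]))   # 0.0  (evenly spread over four packages)
print(gini([20, 0, 0, 0]))  # 0.75 (concentrated in one package)
```

Under the reasoning above, the second distribution is preferable: the library's risk is isolated in one dedicated package.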

This paper describes an approach to use the frequency of use of third-party libraries to assess risks present in a system. With this data, an organization can have insight into the risks present in libraries and contemplate on necessary measures or actions needed to be taken to reduce this risk.

This paper relates to our topic because the API usage is used as a proxy for potential vulnerability risk. In the system we propose we seek to determine whether vulnerable APIs are called.

2.9 Measuring Software Library Stability Through Historical Version Analysis (Raemaekers et al., 2012b)

Vendors of libraries and users of those libraries have conflicting concerns. Users seek backward compatibility, while library vendors want to release new versions of their software to add new features, improve existing features or fix bugs. Library vendors are constantly faced with a trade-off between keeping backward compatibility and living with mistakes from the past. The goal of this paper is to introduce a way to measure interface and implementation stability.

By means of a case study, several issues with third-party library dependencies are illustrated:

• It is shown that maintenance debt accumulates when updates of libraries are deferred.

• The authors show that when a moment in the future arrives where there is no choice but to update to a new version a much larger effort has to be put in than when smaller incremental updates are performed during the evolution of the system.


• It is shown that the transitive dependencies libraries bring along can increase the total amount of work required to update to a new version of a library, even if an upgrade of these transitive dependencies was originally not intended.

• The authors show that a risk of using deprecated and legacy versions of libraries is that they may contain security vulnerabilities or critical bugs.

The authors propose four metrics that provide insight into different aspects of implementation and interface stability. Library (in)stability is the degree to which the public interface or implementation of a software library changes as time passes, in such a way that it potentially requires users of the library to rework their implementations due to these changes.

This study illustrates one of the reasons a system's dependencies are often not kept up to date. We may utilize these metrics in our research to indicate how much a dependency's interface has changed between the currently used version and a newer version containing security improvements. This indication provides an estimate of the amount of time needed to update to a newer release of a dependency.

2.10 An Empirical Analysis of Exploitation Attempts based on Vulnerabilities in Open Source Software (Ransbotham, 2010)

Open source software has the potential to be more secure than closed source software due to the large number of people that review the source code who may find vulnerabilities before they are shipped in the next release of a system. However, when considering vulnerabilities identified after the release of a system, malicious persons might abuse the openness of its source code. These individuals can use the source code to learn about the details of a vulnerability to fully exploit it; the shadow side of making source code available to anyone.

Open source software presents two additional challenges to post-release security. First and foremost, the open nature of the source code eliminates any benefits of private disclosure. Because changes to the source code are visible, they are publicly disclosed by definition, making it easy for hackers to figure out how to defeat the security measures.

Many open source systems are themselves used as components in other software products. Hence, not only must the vulnerability be fixed in the initial source, it must be propagated through derivative products, released and installed. These steps give attackers more time, further increasing the expected benefits for the attacker.

In conclusion, when compared to proprietary dependencies, open source dependencies have a greater risk of exploitation, diffuse earlier and wider and have greater overall volume of exploitation attempts.

Using open-source libraries brings additional security risks due to their open character. Vulnerabilities in these libraries, even when they are patched, propagate to other systems incorporating these libraries. Since the effort to exploit a system decreases due to the availability of the source code, it is paramount that early warnings are issued and distributed upon discovery of a vulnerability. The latter can be accomplished by the tool we propose; this way, owners can limit the exploitability of their system. This research therefore emphasizes why our area of research is so important.

2.11 Understanding API Usage to Support Informed Decision Making in Software Maintenance (Bauer and Heinemann, 2012)

The use of third-party libraries has several productivity-related advantages but it also introduces risks — such as exposure to security vulnerabilities — to a system. In order to be able to make informed decisions, a thorough understanding of the extent and nature of the dependence upon external APIs is needed.


• APIs keep evolving, often introducing new functionality or providing bug fixes. Migrating to the latest version is therefore often desirable. However, depending on the amount of changes — e.g. in case of a major new release of an API — backward-compatibility might not be guaranteed.

• An API might not be completely mature yet. Thus, it could introduce bugs into a software system that may be difficult to find and hard to fix. In such scenarios it would be beneficial to replace the current API with a more reliable one as soon as it becomes available.

• The provider of an API might decide to discontinue its support, such that users can no longer rely on it for new functionality and bug fixes.

• The license of a library or a project might change, making it impossible to continue the use of a particular API for legal reasons.

These risks are beyond the control of the maintainers of a system that are using these external APIs but they do need to be taken into account when making decisions about the maintenance options of a software system. Tool support is therefore required to provide this information in an automated fashion. Bauer and Heinemann (2012) devise an approach to automatically extract information about library usage from the source code of a project and visualize it to support decision-making during software maintenance. The goal is determining the degree of dependence on the used libraries.

This paper is related to our topic in the sense that the tool we will devise could be used to provide insight to the effort required to update a vulnerable dependency to a newer version once it has been discovered.


Chapter 3

Research method

In this chapter we explain the research method we will employ during our research. The goal of this chapter is to explain our instantiation of Technical Action Research.

3.1 Introduction

In this thesis TAR will be employed as proposed by Wieringa and Morali (2012). TAR is a research method in which a researcher evaluates a technique by using it to solve problems in practice. Findings can be generalized to unobserved cases that show similarities to the studied case.

In TAR, a researcher fulfills three roles:

I Artifact designer
II Client helper
III Empirical researcher

The technique is first tested on a small scale in an idealized “laboratory” setting and is then tested in increasingly realistic settings within the research context, eventually finishing by making the technique available for use in other contexts to solve real problems.

Before a suitable technique can be developed, improvement problems must be solved and knowledge questions answered. An improvement problem in this case could be: “How can we assess actual exposure to vulnerabilities in an automated fashion?”. Knowledge questions are of the form “Why is it necessary to determine actual exposure to vulnerabilities?” or “What could be the effect of utilizing this technique in practice?”. To solve an improvement problem we can design treatments. A treatment is something that solves a problem or reduces its severity. Each plausible treatment should be validated, and one should be selected and implemented. A treatment consists of an artifact interacting with a problem context; the treatment is inserted into a problem context, with which it starts interacting. In our case the treatment consists of a tool incorporating the technique proposed before, used to fulfill some goal. Treatments can be validated by looking at their expected effects in context, the evaluation of these effects, expected trade-offs and sensitivities.

It is necessary to determine actual exposure to vulnerabilities because the maintainers of a system often neglect to keep their dependencies updated due to a presumed lack of threat. A tool that points out to complacent maintainers that their perceived sense of security is false would stimulate them to take action; after all, once they know of the threat, so do large numbers of others with less honorable intentions.

The effect of this would be that a system's dependencies are kept up to date better, which should lead to improved security. This is also expected to lead to improved maintainability of a system. This can be substantiated by arguing that the more time has passed since a dependency was last updated, the more effort it takes to upgrade: the public API of a dependency evolves, and as more time passes and more updates are released, the API may have changed so dramatically that it is almost impossible to keep up.


Generalization of solutions in TAR is achieved by distinguishing between particular problems and problem classes. A particular problem is a problem in a specific setting. When abstracted away from this setting, a particular problem may indicate the class of problems it belongs to. This is important because the aim of conducting this research is to accumulate general knowledge rather than case-specific knowledge that does not apply in a broader context.

In the next sections we will explain our instantiation of three cycles, each one belonging to a specific role (client helper, empirical researcher, artifact designer) the researcher fulfills.

3.2 Client helper cycle

3.2.1 Problem investigation

SIG offers security-related services to its clients. As part of this value proposition, the Vulnerability Alert Service (VAS) tool has been devised. Although the tool is considered useful, it also generates a lot of false positives. More importantly, SIG consultants need to manually verify each reported vulnerability to see whether it could impact the client's system. This assessment is based on the consultant's knowledge of the part of the dependency the vulnerability is contained in and of how this dependency is used in the system. An issue is that this assessment is not foolproof, because it relies on the consultant's knowledge of the system, which may be incomplete. A better option would be to automatically assess whether vulnerable code may be executed, without human involvement.

SIG also provides its clients with services to assess the future maintainability of a system. When dependencies are not frequently updated to newer versions, it will require considerably more effort in the future to integrate with newer versions of the dependency due to API changes. As discussed in the introduction, the reason for not updating may be attributed to the anxiety of introducing new bugs when doing so. If any of the used dependencies are known to have security vulnerabilities, the maintainers of such systems have to be convinced of the urgency of updating to a newer version to mitigate the vulnerability. Maintainers may believe, based on their own judgement, that they are not affected by a known vulnerability; this judgement may be poor. Automatic tooling could be employed to convince these maintainers of the need to update when it can be shown that vulnerable code is likely executed. If the tool indicates the system is actually exposed to the vulnerability, the dependency will likely be updated, which may improve the long-term maintainability of the system because the distance between the latest version of the dependency and the version in use decreases. In turn, this makes it easier to keep up to date with breaking API changes when they occur, rather than letting them accumulate. Hence, our tool might also be useful from a maintainability perspective.

We have identified an approach that could be used to fulfill this need. We will design a tool that incorporates such functionality and appraise whether this tool can be exploited in useful ways for SIG. Table 3.1 shows the stakeholders that are involved in the SIG context along with their goals and criteria.

SIG
  Goals: Add value for clients by actively monitoring exposure to known vulnerabilities.
  Criteria: The tool should aid in system security assessments conducted by consultants at SIG. The number of false positives reported should be minimized, as this may lead to actual threats going unnoticed in the noise. Clients should consider any findings of the tool useful and valuable.

SIG's clients
  Goals: The tool allows clients to take action as soon as possible when new threats emerge.
  Criteria: Less exposure to security threats. Improved maintainability of the system.

Table 3.1: Stakeholders in the SIG context and their goals and criteria.


CHAPTER 3. RESEARCH METHOD

3.2.2 Treatment design

Using the artifact (proof-of-concept tool) and the context (SIG) we can devise multiple treatments:

I. Tool indicates actual exposure to a vulnerability in a library → the client updates to a newer version of the dependency → security risk lowered and dependency lag reduced. This treatment contributes to the goals in that the security risk of that specific system is lowered and the maintainability of the system is improved.

II. Tool indicates actual exposure to a vulnerability in a library → the client removes the dependency on the library or replaces it with another library having the same functionality. This treatment might lessen the immediate security risk, but another library might carry another risk. The dependency lag with a new dependency could remain stable, but it can also change negatively or positively depending on the dependency lag of the new dependency.

3.2.3 Design validation

The effect we expect our tool to accomplish is improved awareness of exposure to vulnerabilities on the part of both stakeholders. The resulting value for the client is that they are able to take action and therefore improve the security of the system. Awareness also leads to reduced dependency lag and thus to improved maintainability. Even if the use case of the tool shifts within SIG, the artifact remains useful because it can be used in both security-minded and maintainability-minded contexts.

3.2.4 Implementation and implementation evaluation

The proof-of-concept is used to analyze a set of client systems. We will investigate one client system for which a security assessment is ongoing and schedule an interview with the involved SIG consultants to discover whether our tool supports their work and ultimately adds value for the client.

3.3 Research cycle

3.3.1 Research problem investigation

The research population consists of all clients of SIG having systems with dependencies, as well as the SIG consultants responsible for these systems.

The research question we seek to answer by using TAR is: "Can the results of a tool implementing the proposed technique be exploited in useful ways by SIG?" Here, "useful" denotes that the results add value for SIG and its clients.

We know that the VAS tool currently used at SIG was already considered useful when it was delivered. Therefore it is most relevant to assess what makes our tool more useful than VAS.

3.3.2 Research design

The improvement goal in the research context is to extend or supplement the current VAS tool to assess actual exposure to vulnerabilities, then monitor the results and improve them where possible. We have chosen to proceed with the first (I) treatment (refer to the client helper cycle). This treatment is preferred because it satisfies two goals at the same time, as opposed to the second (II) treatment.

The research question will be answered in the context of SIG. Data is collected by first obtaining analysis results from the tool we propose, then discussing analysis results with SIG consultants or clients. Based on this data we seek to assess which components contribute to the perceived usefulness.

The results are expected to be useful from at least a maintainability and a security perspective. Hence, it is expected that in other contexts the results are deemed useful as well, from these or other perspectives.

3.3.3 Research design validation

We expect that our tool can serve various purposes in different contexts. It should be noted that a human would also be able to assess actual exposure to vulnerabilities. However, as the average number of dependencies used in a system increases, manual examination would only be feasible for systems with few dependencies.

Maintainers of systems with dependencies
  Goals: Improve system maintainability and security by actively monitoring exposure to known vulnerabilities.
  Criteria: Use of the tool should lead to reduced dependency lag and thus fewer maintainability-related problems. Not too many false positives reported.

Companies/entities with internal systems
  Goals: Lessen the security risk of these internal systems.
  Criteria: Not too many missed vulnerabilities (false negatives) leading to a false sense of security.

Researchers
  Goals: Utilize actual vulnerability exposure data in research in order to draw conclusions based on this data.
  Criteria: Accuracy of reported exposure to vulnerabilities.

Third-party service providers
  Goals: Deliver a security-related service to clients.
  Criteria: Scalability and versatility of the solution.

Table 3.2: Stakeholders in the general context and their goals and criteria.

The research design allows us to answer the research question as the tool can be used by consultants at SIG in real client cases. As these consultants actually use the tool to aid in an assessment, they are likely to provide meaningful feedback.

We have identified the following potential risks that may threaten the results obtained in the research cycle:

• SIG clients' systems use uncommon libraries (no CVE data available).
• SIG clients' systems use only proprietary libraries (no CVE data available).
• Perceived usefulness significantly varies per case.
• There is no perceived usefulness. However, in that case we could look at which elements do not contribute to the usefulness and try to change them.
• The VAS system we rely on for CVE detection does not report any vulnerabilities while those are present in a certain library (false negatives).

3.3.4 Analysis of results

We will execute the client helper cycle. Then, we evaluate the observations and devise explanations for unexpected results. Generalizations to other contexts are hypothesized and limitations noted. We will dedicate a separate chapter to this.

3.4 Design cycle

3.4.1 Problem investigation

The tooling currently available to detect known vulnerabilities in the dependencies of a system does not assess actual exposure to these vulnerabilities. We plan to develop a tool that is able to do this. In Table 3.2 we list a number of stakeholders that could potentially be users of this tool in external contexts.

Observing the following phenomena leads us to conclude that there is a need for tooling to aid in the detection of dependencies with known vulnerabilities.

• Up to 80 percent of code in modern systems originates from dependencies (Williams and Dabirsiaghi,
