
Measuring the Influence of Process Automation on the Productivity of Software Development Teams

A Case Study of Proprietary Projects

Pepijn van de Kamp

pepijnvandekamp@gmail.com

February 15, 2015, 70 pages

Supervisors: dr. Magiel Bruntink (Universiteit van Amsterdam) and Wouter de Kort (Ordina)

Host organisation: Ordina, http://www.ordina.nl

Universiteit van Amsterdam


Contents

Abstract iii

1 Introduction 1
  1.1 Problem Statement 1
  1.2 Research Objectives 1
  1.3 Central Question 2
  1.4 Research Method 2
  1.5 Contributions 2
  1.6 Thesis Outline 2

2 Background 4
  2.1 Productivity in Software Engineering 4
  2.2 Process Automation in Software Engineering 5
  2.3 Lean Manufacturing in Software Engineering 5
  2.4 Technical Quality of a Software Product 7
  2.5 Issue Handling Efficiency 9
  2.6 Software Process Mining 12
  2.7 Summary 13

3 Case Study Design 14
  3.1 Problem Analysis 14
    3.1.1 Process Automation Tool Usage 15
    3.1.2 Measuring Changes in the Software Process 15
    3.1.3 Quantifying Issue Handling Efficiency 16
    3.1.4 Measuring Technical Quality 17
  3.2 Case Selection and Description 17
    3.2.1 Selection Criteria 17
    3.2.2 Case Description 17
    3.2.3 Project A 18
    3.2.4 Project B 18
    3.2.5 Project C 19
  3.3 Data Collection and Processing 19
    3.3.1 Process Automation tooling usage 19
    3.3.2 Measuring Process Changes 20
    3.3.3 Quantifying Issue Handling Efficiency 22
    3.3.4 Measuring Technical Quality 23
  3.4 Threats to Validity 24
    3.4.1 Construct Validity 24
    3.4.2 Internal Validity 25
    3.4.3 External Validity 25
    3.4.4 Reliability 26

4 Results & Analysis 27
  4.1 Process Automation tool usage 27
    4.1.1 Project A 27
    4.1.2 Project B 28
    4.1.3 Project C 28
  4.2 Process Changes 29
    4.2.1 Overview of the dataset 29
    4.2.2 Duration Metrics 29
    4.2.3 Cumulative Flow Diagrams 32
    4.2.4 Issue Churn Views 33
    4.2.5 Process Mining 35
  4.3 Issue Handling Efficiency 38
    4.3.1 Risk Profile 39
    4.3.2 Trends in Issue Resolution Speed 39
  4.4 Technical Quality 40
    4.4.1 SIG Maintainability Model 40
    4.4.2 Relation between Issue Resolution Speed and Maintainability 41

5 Discussion 43
  5.1 Process Automation Tool Usage 43
  5.2 Measuring Process Changes 43
    5.2.1 Duration Metrics 44
    5.2.2 Cumulative Flow Diagrams 45
    5.2.3 Issue Churn View 46
    5.2.4 Process Mining 46
  5.3 Quantifying Issue Handling Efficiency 47
    5.3.1 Quantifying Defect Resolution 47
    5.3.2 Quantifying Enhancement Resolution 48
  5.4 Measuring Technical Quality 49
    5.4.1 Relation between Maintainability and Issue Handling Efficiency 50
  5.5 Measuring the Influence of Process Automation on Team Productivity 51
  5.6 Evaluation of Validity 51

6 Conclusions and Future work 53
  6.1 Summary of Conclusions 53
  6.2 Future Work 54

Appendices 56

A Case Study Protocol 57

B Implementation of SIG Maintainability Model 64


Abstract

Software Engineering tool vendors promote the use of Application Life-cycle Management (ALM) tool suites to increase the productivity of teams and the quality of products. One of the pillars of ALM tooling is Process Automation, which enables the automation of the processes between a code commit and the deployment of a new product version. In this study, a set of existing approaches for measuring changes in process and product with data from Issue Tracker Systems and Version Control Systems is evaluated for measuring changes in productivity. Productivity in Software Engineering is influenced by many confounding factors. The factor of technical product quality was controlled with an implementation of the SIG Maintainability Model. An attempt was made to control the factor of process maturity by using techniques from process mining. Furthermore, the investigated projects implemented only a limited level of Process Automation during the time-frame of this research, so the performed measurements and constructed visualizations show little influence of Process Automation usage. Nevertheless, the measurements and visualizations used in this research are evaluated for their ability to capture the process changes that are expected from Process Automation tool usage. Based on these evaluations, a set of recommended changes for future research on measuring the influence of process automation on the productivity of software development teams is presented. This study suggests that a combination of the Cumulative Flow Diagram, the Issue Churn View and the quantification of issue handling efficiency, calibrated with a representative benchmark, is suitable for measuring the expected influence of process automation on the productivity of a software development team within the considered perspectives; the limited use of process automation remains the primary limitation of this study.


Preface

February 15, 2015 Rotterdam, The Netherlands

This thesis is submitted in partial fulfillment of the requirements for the degree of Master of Science in Software Engineering, at the University of Amsterdam. This research was carried out at Ordina Software Development (The Netherlands) in the time-frame from September 2014 to February 2015. The project was supervised by dr. Magiel Bruntink from the University of Amsterdam and Wouter de Kort from Ordina Software Development.

The results of this project would not have been possible without the help of several people. I would like to express my gratitude to dr. Magiel Bruntink for his support, guidance, inspiring words and sharp questions throughout this project. I would also like to thank Wouter de Kort for his feedback, guidance and pragmatic approach, and Bart van den Berg in his role of vice president of Ordina Software Development for hosting and sponsoring this research project. Special thanks go to the team members of the projects investigated in this research for providing data and helping to relate my findings to their projects. Finally, I would like to thank my colleagues from the Ordina ALM Competence Center, Ordina Maintenance & Outsourcing, Ordina Risk Management Services and the Ordina Pricing Office for their interest in this research.

I am grateful to my partner, family and friends for believing in me and supporting me while fulfilling this research project.


Chapter 1

Introduction

1.1 Problem Statement

Suppliers of software development tooling, like Microsoft and IBM, promote the use of Application Life-cycle Management (ALM) tooling to their customers to improve the productivity of software teams and the quality of software products. One of the pillars of Application Life-cycle Management is the automation of processes during development to enable earlier feedback. In order to provide better consultancy to clients it is necessary to have a better understanding of how ALM tooling contributes to increased productivity and quality. It is also desirable to be able to provide clients with references to prior ALM implementations and their results, as well as an estimate of the expected return on investment. To maximize the potential of an ALM implementation it is necessary to monitor and control the progress of process automation and its effect on the productivity of the team and the quality of the product.

1.2 Research Objectives

This research focuses on the Process Automation aspect of Application Life-cycle Management tooling. Many businesses revolve around IT systems these days, and the sooner these IT systems deliver value, the earlier they deliver a return on investment. Vendors of Process Automation tooling, like Microsoft's ALM tool suite, claim that it contributes to increased productivity of software teams and increased quality of the product. This claimed increase in productivity and quality suggests that Process Automation tooling supports development teams in delivering more value of higher quality to their clients earlier, thereby saving money and resources. It is therefore relevant to investigate this claim and research whether and how the introduction of Process Automation could lead to higher productivity of software teams.

The objective of this research is to propose a set of metrics and visualizations from various quality perspectives to (1) gain insight into the efficiency of the current Software Development Process, (2) identify parts of the process that could benefit from Process Automation tooling, and (3) measure trends in the productivity of software teams and the quality of software products in order to monitor and control the implementation of Process Automation tooling.


1.3 Central Question

To gain insight into the productivity gains that are expected when process automation tooling is introduced in a software development process, it is necessary to identify approaches that measure changes in productivity. The central question of this research is stated as follows:

How to measure the influence of process automation on the productivity of a software development team?

During the course of this thesis the central question will be refined into a set of research questions in the design of this research.

1.4 Research Method

An exploratory case study will be used as a research method. This study will be conducted at Ordina Software Development, a large supplier of software expertise in The Netherlands. The units of analysis within this case study are three proprietary software projects. These projects will be investigated in a time-frame of four months during the implementation of Microsoft ALM tooling in the software development process.

Existing literature on productivity and process automation in Software Engineering will be studied to select a set of metrics and visualizations to gain insight into various perspectives of productivity. From these existing approaches a research design will be constructed to evaluate the use of these approaches for measuring the influence of process automation on productivity.

1.5 Contributions

The following contributions are presented in this study:

• A set of metrics and visualizations is presented that is expected to aid in measuring and controlling the influence of process automation on the productivity of a software development team.

• The metrics and visualizations aid in gaining insight into the current efficiency of a software development process.

• The metrics and visualizations aid in the identification of parts in the software process that could benefit from process automation tooling.

1.6 Thesis Outline

The outline of this thesis is structured as follows:

• Chapter 2 discusses background information related to the design of this study.

• Chapter 3 describes the research design of this case study.

• Chapter 4 presents the results and analysis of this case study.


• Chapter 5 discusses how the findings of this research answer the research questions and evaluates the validity of the findings.


Chapter 2

Background

This chapter describes existing work related to the research conducted in this study. The descriptions of the existing work provide the necessary background information on theories, techniques and approaches used in this research.

2.1 Productivity in Software Engineering

The continuous improvement of productivity is a goal that is pursued by many industries, including the software engineering industry. The term productivity can be defined as an efficiency measure for the ratio of units of output over units of input in a certain production process [46]. In contrast with other industries, there is no common understanding yet in the software engineering industry on how to measure the output of the software development process.
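Restating this definition as a formula:

    \[ \text{Productivity} = \frac{\text{units of output}}{\text{units of input}} \]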

Various methods on how to measure productivity in software engineering have been proposed in the past decades. Early studies on productivity in software engineering focused on measuring the lines of code (LOC) produced per unit of time [69]. This method led to problems when comparing the productivity of projects using different programming languages. A method to compare productivity between projects is the Function Point Analysis, which was introduced by Albrecht [1]. A disadvantage of this method is that it is time-consuming and expert knowledge is required to be able to count function points and interpret the results of a Function Point Analysis.

Software engineering work entails both thinking and learning about the problem domain, which cannot be expressed or counted in measures like Lines of Code or Function Points. In Software Engineering, it is desirable to produce the fewest lines of code possible in order to solve a problem. Therefore, increased productivity cannot be equated with increased production of code, which is counter-intuitive compared to other industries. The fact that most of the output of a software engineering process is not tangible makes it hard to quantify productivity. Boehm [12, 13] and Jones [31] identified that the time needed to resolve a specific unit of work in software engineering depends on a large number of factors that vary per project. Therefore, it is necessary to identify project-specific factors that influence productivity, and take these factors into account when analyzing productivity output measurements. A structured literature review of factors that influence productivity in software engineering is presented by Wagner et al. [68].


2.2 Process Automation in Software Engineering

The term Process Automation is a collective noun used in the software engineering industry for continuous practices, like Continuous Integration, Continuous Testing and Continuous Deployment (among others) [8, 25]. The aim of Process Automation is to automate as many steps as possible between the check-in of a code change by a developer and the release of a new product version for the customer. This approach leads to (1) rapid feedback to developers on code check-ins, (2) continuous delivery of value to customers, and (3) less manual work (and thus less risk) involved in the deployment process [20, 25, 26].

Process Automation tooling is often part of integrated Application Life-cycle Management (ALM) tool suites, like (among others) Microsoft Visual Studio ALM (http://www.visualstudio.com/en-us/explore/app-lifecycle-management-vs.aspx) [49, 50] or IBM Rational Jazz [21]. A recent Software Development practice that stresses Process Automation is DevOps [20, 66]. Vendors of ALM tool suites promoting both Process Automation and DevOps practices promise increased software quality and increases in the productivity of software teams of up to 50% (see the vendor websites and the vendor technical report [66]).

From a scientific perspective it is necessary to validate these claims with empirical research; little scientific evidence has so far been presented to support these industry claims. A possible reason for this is that it is difficult to control other factors that influence productivity and quality. From an industry perspective it is necessary to develop tools and visualizations that could aid software development teams in identifying process areas that could benefit from Process Automation, thereby enabling further process improvement.

2.3 Lean Manufacturing in Software Engineering

Fitzgerald et al. argue in [20] that the various continuous practices in software engineering have strong links with the Lean Manufacturing school of thought. Fitzgerald et al. [20] propose Lean Manufacturing as a useful concept for assessing Process Automation practices in software engineering.

Lean Manufacturing is a school of thought derived from the Japanese car industry; the term was first coined by Krafcik [32]. It comprises a set of principles and tools that aim to eliminate waste within a manufacturing process, where waste is created through unevenness in the flows of work [40, 70]. Lean Manufacturing provides a different perspective on the term productivity: it considers the value that is created for the client as the output of the manufacturing process.

The opportunities for the use of Lean Manufacturing principles and tools in a software engineering context have been illustrated by Poppendieck et al. [45]. Poppendieck et al. describe the similarities and differences of Lean principles with Agile principles and how the two paradigms can complement each other. What distinguishes Lean from Agile is the focus on the end-to-end perspective of the whole flow of value through development [45]. All activities that do not directly contribute to added product value are considered waste and should therefore be eliminated or improved. From this perspective, Process Automation can be seen as a tool that eliminates waste in the software development process by automating manual repeated work that does not directly add value to the product under development [20].

Lean Manufacturing proposes a set of duration metrics to gain insight into the time that is spent on value-adding activities and the time that is spent on waiting [45]:

• Waiting Time: The time a unit of production is queued in the process and no actual work is performed nor value is added.

• Service Time: The time a unit of production is being worked on and value is added.

• Lead Time: The time it takes a unit of production to get from one end of a process to the other.

Another metric used in Lean Manufacturing practices is the Value-added ratio, as proposed by Shingo [56]:

• Value-added ratio: The ratio of the average time spent on value-added work (Service Time) to the total average time a unit of production is in the process (Lead Time) [56].
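Restating this definition as a formula, with bars denoting averages over the units of production in a given time-frame:

    \[ \text{Value-added ratio} = \frac{\overline{\text{Service Time}}}{\overline{\text{Lead Time}}} \]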

The goal of Lean Manufacturing is to optimize the flow of work through the process, reduce waste, compress lead times and deliver early value to clients [26, 45]. When the flow of work through the different phases of a process is not evenly spread this could lead to bottlenecks in a process, and thus to an increase of lead times. Reinertsen states in [48] that the Cumulative Flow Diagram (CFD) is a tool to visualize what happens in a queuing system and quickly determine how our decisions impact queues in the process. A CFD is a stacked area graph that depicts the quantity of work in a given process state over time [4, 5]. The use of these diagrams to visualize the flow of work in software development with data from issue tracker systems was illustrated in [30, 42, 43, 44]. An example of a CFD for a requirements engineering process is depicted in Figure 2.1.

A CFD can be constructed by plotting the cumulative arrivals of units of production in the states of a process (Y axis) versus the time (X axis). From such diagrams the following observations can be made:

• The slope of a line shows the capacity of the underlying process stage [48].

• Horizontal distances tell how much time an individual unit spends in a particular process stage [48].

• Vertical distances are the quantities of the units of production in a certain process stage at a certain time [48].

• Differences in the steepness between process stages could indicate bottlenecks and inefficiencies in a process [44].

• The slope of all cumulative items shows the inflow of units of production over time [44].

• The CFD shows how much work is still to be done (inventory) and how much work is in the process (work in progress) over time [5].

This visualization aids with the identification of bottlenecks and problems in the hand-overs of work between the phases. Therefore, CFDs provide valuable input for process improvements and could possibly aid in identifying process steps that could benefit from Process Automation.


Figure 2.1: Example of a Cumulative Flow Diagram (picture taken from [44])

2.4 Technical Quality of a Software Product

From the perspective of productivity it is expected that source code of low quality has a negative impact on issue handling efficiency. This relationship has been studied in research conducted at the Software Improvement Group (SIG). SIG utilizes the SIG Maintainability Model to assess the technical quality of software products. This model calculates a maintainability rating by mapping source code properties to sub-characteristics of the maintainability quality characteristic, as defined in the ISO/IEC 9126 international standard of Software Product Quality [27]. The Maintainability Model was first introduced by Heitlager et al. [23], calibrated by Alves et al. [2, 3], evaluated by Baggen et al. [6], and utilized in various studies, like [11, 19, 29, 36]. The model has recently been enhanced with metrics to evaluate implemented architectures, as described by Bouwers et al. [14]. Besides scientific evaluation, the model is also utilized by the SIG for the certification of the technical quality of the source code of software products. Various perspectives and extensions of the SIG Maintainability Model are described in [2, 3, 6, 14, 19].

To compare technical quality among systems, the model incorporates two levels of aggregation to convert individual source code metric values to a 5-point rating scale [6]. The first aggregation maps metric values to risk categories. This mapping results in a risk profile for the measured property of a software system, which represents the percentage of volume in each category. For the second level of aggregation, the risk profiles are aggregated to a rating on the 5-point scale by using cumulative rating thresholds. The rating thresholds for both the risk categories and risk profiles are calibrated on a yearly basis from benchmark data using the methodology described by Alves et al. [2, 3]. Hereby, the benchmark-based approach of the SIG Maintainability Model differs from other technical quality models, like the Maintainability Index [41].

The yearly updated threshold values are not made publicly available by the SIG. However, threshold values have been published in research by the SIG for the following subset of source code metrics:

Volume

The total size of a system, measured in Lines of Code (LOC); blank lines and comments are ignored [23] (see Table 2.1).


Duplication

The percentage of redundant code in equal code blocks of at least 6 lines [6]. (see Table 2.2).

Unit Size

Lines of code per unit [2]. (see Table 2.3).

Unit Complexity

The cyclomatic complexity [38] of code per unit [6]. (see Table 2.4).

Unit Interfacing

The number of parameters declared in the interface of each unit [2]. (see Table 2.5).

Module Coupling

The number of incoming invocations per module [2]. (see Table 2.6).

The results for each metric are aggregated to a rating on a continuous scale between 0.50 and 5.50 by using the presented benchmark values and the interpolation function as described by Alves et al. [2]. To arrive at a maintainability rating, the rating for each metric is first mapped to a sub-characteristic of maintainability as defined in the ISO/IEC 9126 [27]. The mapping between the code metrics and sub-characteristics of maintainability is depicted in Table 2.7. A rating for each sub-characteristic is calculated by taking the mean of the mapped metric ratings. Finally, the maintainability rating is calculated by taking the mean of the ratings of the sub-characteristics.

Rating   Man Years   Java (KLOC)   C# (KLOC)
★★★★★    0 - 8       0 - 66        0 - 64
★★★★     8 - 30      66 - 246      64 - 240
★★★      30 - 80     246 - 665     240 - 640
★★       80 - 160    665 - 1310    640 - 1280
★        >160        >1310         >1280

Table 2.1: Volume threshold values [23, 67]

Rating   Duplication
★★★★★    3%
★★★★     5%
★★★      10%
★★       20%
★        >20%

Table 2.2: Duplication threshold values [6]

Rating   Moderate Risk   High Risk       Very High Risk
         (33 - 44 LOC)   (44 - 74 LOC)   (>74 LOC)
★★★★★    19.5%           10.9%           3.9%
★★★★     26.0%           15.5%           6.5%
★★★      34.1%           22.2%           11.0%
★★       45.9%           31.4%           18.1%
★        >45.9%          >31.4%          >18.1%

Table 2.3: Unit Size threshold values [2]


Rating   Moderate Risk   High Risk      Very High Risk
         (CC 11 - 20)    (CC 21 - 50)   (CC >50)
★★★★★    25%             0%             0%
★★★★     30%             5%             0%
★★★      40%             10%            0%
★★       50%             15%            5%
★        >50%            >15%           >5%

Table 2.4: Unit Complexity threshold values [6]

Rating   Moderate Risk    High Risk        Very High Risk
         (2 parameters)   (3 parameters)   (>3 parameters)
★★★★★    12.1%            5.4%             2.2%
★★★★     14.9%            7.2%             3.1%
★★★      17.7%            10.2%            4.8%
★★       25.2%            15.3%            7.1%
★        >25.2%           >15.3%           >7.1%

Table 2.5: Unit interfacing threshold values [2]

Rating   Moderate Risk            High Risk                Very High Risk
         (10 - 22 dependencies)   (23 - 56 dependencies)   (>56 dependencies)
★★★★★    23.9%                    12.8%                    6.4%
★★★★     31.2%                    20.3%                    9.3%
★★★      34.5%                    22.5%                    11.9%
★★       41.8%                    30.6%                    19.6%
★        >41.8%                   >30.6%                   >19.6%

Table 2.6: Module coupling threshold values [2]

                Volume   Duplication   Unit Size   Unit Complexity   Unit Interfacing   Module Coupling
Analyzability   X        X             X
Changeability            X                         X                                    X
Stability                                                            X                  X
Testability                            X           X

Table 2.7: Mapping of source code metrics to ISO/IEC 9126 sub-characteristics [6]
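To make the two-level aggregation concrete, the sketch below maps a risk profile to a rating on the continuous 0.5-5.5 scale, using the Unit Size thresholds of Table 2.3 as an example. This is a minimal illustration rather than the SIG Software Analysis Toolkit: all identifiers are invented, and the piecewise-linear interpolation combined with a minimum over the risk categories is a simplification of the interpolation scheme published by Alves et al. [2].

    using System;

    static class RiskProfileRating
    {
        // Cumulative Unit Size thresholds (Table 2.3) for 5, 4, 3 and 2 stars,
        // per risk category (% of volume in units of that risk).
        static readonly double[] ModerateRisk = { 19.5, 26.0, 34.1, 45.9 };
        static readonly double[] HighRisk     = { 10.9, 15.5, 22.2, 31.4 };
        static readonly double[] VeryHighRisk = {  3.9,  6.5, 11.0, 18.1 };

        // Piecewise-linear interpolation: 0% risky code rates 5.5, a value exactly
        // on the 5-star threshold rates 4.5, on the 2-star threshold 1.5.
        static double Interpolate(double value, double[] bounds)
        {
            double[] xs = { 0.0, bounds[0], bounds[1], bounds[2], bounds[3] };
            double[] ys = { 5.5, 4.5, 3.5, 2.5, 1.5 };
            for (int i = 1; i < xs.Length; i++)
                if (value <= xs[i])
                    return ys[i - 1] + (ys[i] - ys[i - 1])
                         * (value - xs[i - 1]) / (xs[i] - xs[i - 1]);
            return 0.5; // simplification: beyond the 2-star threshold, floor rating
        }

        // The rating of a risk profile is bounded by its worst risk category.
        static double Rate(double moderate, double high, double veryHigh) =>
            Math.Min(Interpolate(moderate, ModerateRisk),
            Math.Min(Interpolate(high, HighRisk),
                     Interpolate(veryHigh, VeryHighRisk)));

        static void Main()
        {
            // Example: 30% moderate, 12% high and 4% very high risk volume.
            Console.WriteLine($"Unit Size rating: {Rate(30.0, 12.0, 4.0):F2}");
        }
    }

A rating obtained this way would subsequently be averaged into sub-characteristic ratings via the mapping of Table 2.7, and those into the final maintainability rating.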

2.5 Issue Handling Efficiency

Scientific work on assessing the issue handling efficiency of a software development team has been presented by Luijten et al. [34, 35, 36]. Luijten et al. [35] propose the Issue Churn View (ICV) to provide a high-level perspective on the issue handling process. The ICV is a diagram that shows issue handling activities on a monthly basis. The X-axis represents time; the positive and negative values on the Y-axis represent the numbers of submitted and resolved issues [35]. An example of an ICV is depicted in Figure 2.2.


For each month, the ICV distinguishes:

• The number of opened issues.

• The number of solved issues.

• The number of issues that were opened and solved.

• The number of issues that were opened but not solved.

• The number of solved backlog issues.

• The number of recent open issues in the backlog (<6 months).

• The number of long-term issues in the backlog (>6 months).

Figure 2.2: Example of an Issue Churn View (picture taken from [34])

Luijten et al. [34, 35] used the Issue Churn View for assessing issue handling efficiency in three open source projects and demonstrated its ability to visualize non-trivial aspects of the issue handling process. Issabayeva et al. [29] used ICVs to assess the issue handling process for three proprietary software projects and found notable differences in efficiency.

Luijten et al. [36] conducted an empirical study that demonstrated a significant, positive statistical correlation between the quality of software products as measured by the SIG Maintainability Model and the speed at which defects are solved by the development and/or maintenance teams of these products. For the experiment, issue tracker data and source code data from a range of open source projects was used. For each time interval between snapshot dates, the defects that were resolved in that interval were grouped. To measure defect resolution speed from issue tracker data, the following duration metric was used:

Resolution Time: The time between the moment an issue was submitted and the moment it was marked as being resolved and/or closed [36].

Luijten et al. [36] treat this metric as an indicator of the effort that was spent on solving the defect, in the absence of more accurate data. Issue resolution times do not follow a normal distribution, but rather a power-law-like distribution. Therefore, taking the mean or median of defect resolution times is not appropriate. Luijten et al. [36] propose the use of risk categories (similar to the approach by Heitlager et al. [23]) and have mapped these risk categories to risk profiles. The threshold values for both the risk categories and risk profiles were calibrated from ITS data from open source projects.


The risk profile, which is the distribution of risk categories in a certain time interval, corresponds to a rating on a discrete scale from 1 to 5. The linear interpolation function as described by Alves et al. [2] is used to arrive at a continuous scale between 0.5 and 5.5. This continuous scale is achieved by interpolating the values of the risk profiles and the lower and upper thresholds for the risk categories [2].

The study by Luijten et al. [36] was replicated and extended by Bijlsma et al. [11]. Bijlsma et al. [11] calibrated new threshold values for the rating of both defect resolution speed and enhancement resolution speed. The study by Bijlsma et al. [11] showed strong significant correlations between the maintainability ratings (as calculated with the SIG Maintainability Model) and both defect resolution ratings and enhancement resolution ratings. Bijlsma et al. [11] published threshold values for the risk categories (see Table 2.8) and risk profiles (see Table 2.9) as used for the rating of defect resolution speed. Also, threshold values were published for the risk categories (see Table 2.10) and risk profiles (see Table 2.11) used for rating the enhancement resolution speed.

The study by Luijten et al. [36] was also replicated by Issabayeva et al. [29] in a case study with three proprietary software projects. Issabayeva et al. [29] found no significant correlation in the combined dataset of all three projects. However, a significant correlation was found in the dataset of a single project with a high issue handling performance (rated 4.5 stars).

Category    Thresholds
Low         0 - 28 days (4 weeks)
Moderate    28 - 70 days (10 weeks)
High        70 - 182 days (6 months)
Very high   182 days or more

Table 2.8: Thresholds of risk categories for defect resolution times [11]

Rating   Moderate   High   Very High
★★★★★    8.3%       1.0%   0.0%
★★★★     14%        11%    2.2%
★★★      35%        19%    12%
★★       77%        23%    34%

Table 2.9: Mapping of risk profiles to defect resolution ratings [11]

Category    Thresholds
Low         0 - 152 days (6 months)
Moderate    152 - 365 days (1 year)
High        365 - 730 days (2 years)
Very high   730 days or more

Table 2.10: Thresholds of risk categories for enhancement resolution times [11]

Rating   Moderate   High   Very High
★★★★★    4.7%       0.0%   0.0%
★★★★     51%        6.4%   0.0%
★★★      75%        63%    12%
★★       75%        63%    34%

Table 2.11: Mapping of risk profiles to enhancement resolution ratings [11]


2.6 Software Process Mining

A challenge in software engineering is that systems become increasingly complex. It is therefore necessary to have well-defined processes that support the construction and delivery of complex software with high quality. Research by Boehm [12, 13] and Jones [31] identified Process Maturity as a significant factor for productivity in software engineering projects. It is therefore necessary to investigate the current maturity of a software process to find opportunities for process improvements to reduce lead times prior to the implementation of Process Automation.

A widely adopted method in industry for assessing the maturity of a software process is the Capability Maturity Model Integration (CMMI), published by the Software Engineering Institute [58]. A CMMI process assessment relies on information from interviews, oral audit sessions, quality manuals and process standard reviews [18]. Brodman et al. [17] argue that CMMI is too extensive to apply in smaller organizations and emphasize the need for a light-weight software process improvement approach for smaller organizations. Šamalíková et al. [54] identified components within the CMMI assessment that could be aided by Process Mining techniques. An advantage of Process Mining is that it relies on objective data from event logs instead of the mostly subjective information gathering techniques of the CMMI assessment [54].

Process Mining is a process management technique that allows for the analysis of business processes based on event logs [61, 62]. Process Mining can also be conducted on the event logs of Software Configuration Management (SCM) systems to make the underlying engineering processes explicit in terms of one or more process models [52]. Šamalíková et al. [54] identified two process mining techniques to complement a CMMI assessment with the analysis of event log data from SCM systems:

Process Model Discovery is a process mining technique that constructs a process model (the actual process) from a control-flow perspective by analyzing logged data from a runtime process [64].

Conformance Checking is a process mining technique that compares an existing process model (the perceived process) with an event log of a runtime process to analyze discrepancies between the log and the model [51].

A challenge in constructing process models from event logs is to balance between simplicity, precision, generalization and fitness. A model should allow for the behavior as captured in the event log, while not becoming overly complex or too general. The simplest model that can explain the behavior seen in the log is the best model [61]. This principle is known as Occam’s Razor in Process Mining literature [61]. Another challenge in Process Mining is when processes change while they are being analyzed. This phenomenon is known as Concept Drift in Process Mining literature [61].

In a Process Conformance Analysis a set of orthogonal metrics is used that all yield a value between 0 and 1 [51]. The level of conformance can be determined by taking the mean of the following metric values:

Fitness The extent to which the log traces can be associated with valid execution paths specified by the process model [51].

Behavioral Appropriateness The precision with which the observed process is described by the process model [51].

Structural Appropriateness The extent to which the model describes the observed process in a structurally suitable way [51].
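Written as a formula, with f for Fitness, a_B for Behavioral Appropriateness and a_S for Structural Appropriateness:

    \[ \text{Conformance} = \frac{f + a_B + a_S}{3}, \qquad f,\, a_B,\, a_S \in [0, 1] \]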

Rozinat and van der Aalst note in [51] that a perceived conformance problem is two-fold: the perceived model may be assumed correct because it represents the way the process should be carried out, but the event log may be assumed correct because it represents what really happened. A process model might either be outdated or not tailored to the needs of the employees performing the tasks [51]. A mature process is characterized by a high correspondence between how work is carried out and how the process is defined. A high level of process conformance is therefore an indicator of process maturity.

Gupta et al. [22] report on a similar approach of applying business process mining tools and techniques to analyze the event log data (bug report history) generated by an issue tracking system with the objective of discovering runtime process maps, inefficiencies and inconsistencies. Ramler et al. [47] provide a practical example of how custom Process Automation tooling can help in maintaining the conformance of software process artifacts. A case study of their approach showed a decrease in the average task duration of almost 70% in a period of 1 year after introducing process conformance checks [47].

2.7 Summary

This chapter presented an overview of existing scientific work related to the subject of this research: productivity in software engineering, Process Automation, Lean Manufacturing, technical product quality, issue handling efficiency and software process mining.


Chapter 3

Case Study Design

This chapter presents the design of this research. First, the problem analysis motivates and describes the research questions and presents a solution outline for each question. Second, the case and subjects used in this case study are introduced. Third, the procedures for data collection and processing are elaborated. Finally, the threats to validity of this research design are discussed.

3.1 Problem Analysis

Tool vendors claim that Process Automation tooling increases the productivity of software development teams. However, from a scientific perspective it is necessary to validate these claims with empirical research. From an industrial perspective it is necessary to develop tools and visualizations that could aid software development teams in identifying process areas that could benefit from Process Automation, thereby enabling further process improvement.

The researched literature on productivity measurements in software engineering revealed that there is no common method to measure and quantify productivity (see section 2.1). Increases in productivity are expected to be reflected in the software process itself and in the product under development. The studied literature presented approaches to measure aspects that are closely related to productivity. It is necessary to evaluate these existing approaches for measuring the influence of Process Automation on productivity. The existing approaches will be evaluated by following the implementation of Process Automation tooling in three proprietary software projects.

To research how the influence of Process Automation in the projects under investigation can be measured, the central question was introduced:

Central Question: How to measure the influence of Process Automation on the productivity of a software development team?

The central question will be refined into a set of research questions in the following sections. The presented literature on Process Automation tooling (see section 2.2) suggests that Process Automation tooling leads to rapid feedback to developers on code commits, continuous delivery of value to customers and a reduction of risk in the delivery process due to less manual work. From these insights a hypothesis of expected process changes is derived:

H1 When Process Automation tooling is integrated within the process of a software development team, lead times are reduced because waste is eliminated from the process.


3.1.1 Process Automation Tool Usage

To reason on the influence of Process Automation in a project under research it is first necessary to investigate how the software development teams use Process Automation tooling in their project. RQ1 is presented to provide insight into how a software team uses Process Automation tooling, enabling further reasoning on the influence of this tooling on productivity:

RQ1 Which process parts were automated by the software development teams after the introduction of Process Automation tooling?

This question can be answered by observing and interviewing the teams during the implementation of the Process Automation tooling. The findings can be verified by inspecting the database from the ALM tooling, which keeps track of all automated processes.

3.1.2 Measuring Changes in the Software Process

It is expected that increases in productivity are reflected in the software process itself. It is therefore necessary to investigate how changes in a software process can be measured. RQ2 is presented to provide insight into this matter:

RQ2 Are there measurable process changes that are imposed by Process Automation tooling?

To be able to analyze changes in the process of a development team during the introduction of Process Automation tooling, every developer would have to be observed, and one would have to record exactly what kind of task the developer is spending their time and effort on. Because this does not fit within the schedule and budget of this research project, this technique is considered infeasible.

An indirect data source that records state changes of work that is handled by the development team is the Issue Tracker System (ITS). An ITS records for every issue when it was registered, when the state of the issue was changed and when the issue was resolved. In the absence of a better data source, the ITS data is used as a substitute for time and effort data.

The studied literature presented various approaches to measuring and visualizing process changes with ITS data, from the perspectives of the duration of issue handling (see section 2.5), the flow of work through the development process (see section 2.3) and the consistency of a development process (see section 2.6). The Goal Question Metric (GQM) paradigm [7, 65] is applied to identify sub-questions and arrive at a meaningful selection of metrics from existing literature:

GOAL: Measure process changes in a software development process which are influenced by Process Automation

QUESTIONS:

RQ2.1 How fast does work progress through the software development process?

RQ2.2 How does work flow through the software development process?

RQ2.3 How consistent is the software development process?


Metric / Visualization         Reference                 RQ2.1   RQ2.2   RQ2.3
Lead Time                      [26, 45, 66]              X
Service Time                   [26, 45]                  X
Waiting Time                   [26, 45]                  X
Resolution Time                [11, 29, 36]              X
Value-added Ratio              [26, 56]                  X
Cumulative Flow Diagram        [5, 30, 42, 43, 44, 48]   X       X       X
Issue Churn View               [29, 35]                  X       X       X
Process Model Discovery        [54, 63]                           X
Process Conformance Analysis   [51, 54]                                  X

The duration metrics Lead Time, Service Time, Waiting Time and Resolution Time provide an overview of various process times of the issue resolution process. The Cumulative Flow Diagram is used to get an understanding of how work progresses through the software development process. The Value-added Ratio is selected to provide insight into the relative amount of waste in the process flow. The Issue Churn View is used to monitor and assess the issue handling process over time. The technique Process Model Discovery was selected to provide insight into the flow of work through the process as it is captured in the event logs from ITS data. The Process Conformance Analysis was selected to provide insight into the consistency of the issue handling process.

The selected metrics and visualizations will be evaluated for measuring process changes in the investigated projects during the introduction of Process Automation tooling.

3.1.3 Quantifying Issue Handling Efficiency

The studied literature described a benchmark-based approach [11, 36] for quantifying the efficiency of the issue handling process (see section 2.5). With a benchmark-based approach it is possible to compare issue handling efficiency before and after the introduction of Process Automation, as well as between projects. In this study the usefulness of this approach for measuring the influence of Process Automation on issue handling efficiency is explored. RQ3 is introduced to provide insight into this matter.

RQ3 How can the effects of the introduction of Process Automation on issue handling efficiency be measured?

Issue handling in a software development process roughly consists of two kinds of activities: implementing enhancements and resolving defects. These activities have different priorities in a software development process. Therefore, the usefulness of the approach described in section 2.5 for measuring the influence of Process Automation is evaluated separately for both processes:

RQ3.1 How can the effects of the introduction of Process Automation on defect resolution be measured?

RQ3.2 How can the effects of the introduction of Process Automation on enhancement resolution be measured?

To answer these questions, the benchmark data as presented in section 2.5 will be used to rate both defect resolution and enhancement resolution in time intervals.


3.1.4 Measuring Technical Quality

Much work in software engineering consists of performing adjustments to an existing code-base. As a consequence, the quality of this code-base influences the productivity of a software development team. Therefore, it is necessary to measure and control the technical quality of the code-base of the product during the introduction of Process Automation tooling:

RQ4 How to measure and control the technical quality of the code-base during the introduction of Process Automation?

Section 2.4 described the SIG Maintainability Model to assess the technical quality of software products. In this study, an attempt is made to provide an implementation of the code metrics and their aggregation to a maintainability rating for the C# language, similar to the SIG Software Analysis Toolkit as described in [2, 3, 6, 23, 67]. To gain insight into the effect of Process Automation on the quality of the code-base, snapshots of the code-base will be downloaded from the Version Control System (VCS) in time-intervals and analyzed with the SIG Maintainability Model.

Additionally, the relation between maintainability and issue resolution speed will be investigated in the projects studied in this research. Hereby, the studies by Luijten et al. [36] and Bijlsma et al. [11] are replicated.

3.2 Case Selection and Description

3.2.1 Selection Criteria

A number of requirements were used as criteria for the selection of the case and subjects. To allow replication of this study the requirements are stated below:

• The case represents a software company with interest in Process Automation tooling.

• The subjects under investigation are proprietary software projects with an iterative release process (explicitly not waterfall projects) with iterations of at most one calendar month (as proposed by Schwaber et al. [55]).

• The team size of the software development team is between 3 and 9 developers (the optimal team size for iterative projects as proposed by Schwaber et al. [55]).

• The project makes use of an Issue Tracker System (ITS) in which enhancements and defects are registered and administered.

• The project makes use of a Version Control System (VCS) to enable collaborative source code development among the developers.

• Both the ITS and VCS should have recorded a project history of at least 3 months prior to the research.

3.2.2 Case Description

This study is conducted as an exploratory case study at Ordina Software Development. The company was selected due to its interest in integrating process improvements throughout the Software Development unit, supported by Application Life-cycle Management tooling from Microsoft. Ordina Software Development will be further referred to as the host organization.

The case study was conducted following the guidelines proposed by Runeson et al. [53]. Prior to the collection of data a Case Study Protocol was set up, containing design decisions and field procedures, which is maintained and updated throughout the study. This Case Study Protocol is included in Appendix A to allow for review and replication of this study by other researchers. As frame of reference, this study can be considered an embedded case study of three proprietary software projects that are executed by employees of the host organization. The frame of reference is depicted in Figure 3.1. The projects were selected for this study because the implementation of Process Automation tooling from the Microsoft Application Life-cycle Management suite within the projects' development processes was planned within the time-frame of this study. The units of analysis are the individual projects, and the study subjects are the development teams as a whole. All individuals involved in product development work are considered part of the development team.


Figure 3.1: Frame of reference of the case study

The projects are executed on behalf of clients of the host organization. For confidentiality reasons this study does not report details of the client organizations or details of the projects that are not related to the purpose of this research. In this study, the individual projects are referred to as Project A, Project B and Project C. The projects are not related to each other and there are no collaborations between the development teams. The main programming language of the three projects is C#. The projects are briefly introduced in the following subsections.

3.2.3 Project A

Project A comprises the development and maintenance of a software system that was initially developed by another team outside of the host organization. A team of 5 software engineers from the host organization has taken over the maintenance and development work from the previous team. At the time of the study the team from the host organization had 7 months of experience with the project and had grown to 9 developers. The system has been in production for several years and the new team is responsible for maintaining it by fixing defects and implementing new functionality. The team uses Scrum for their development process with sprints of 4 weeks.

3.2.4 Project B

Project B comprises the construction of a software system that was (similar to Project A) initially developed by another team and was transferred during initial development to the current team at the host organization. At the time of the study, the team of 9 developers was planning to finish the initial development stage and the project would soon be transferred to a support and maintenance organization. The project was developed with the Waterfall method, but the team used Scrum for their internal development process.

3.2.5 Project C

Project C comprises the construction of a software system consisting of many subsystems. The software system is in the initial development stage. The team of 7 developers has worked on the project since it was initiated and has a year of experience with the project at the time of the study. The team uses Scrum and delivers new functionality to the client organization every 2 weeks.

3.3 Data Collection and Processing

This study utilizes multiple data sources in order to increase the validity of the results, as recommended by Runeson et al. [53]. These data sources can be divided into qualitative data and quantitative data. On the one hand, the qualitative data sources are the observations of the teams and interviews with the teams. On the other hand, the quantitative data sources are the project history data, Issue Tracker System data and Version Control System data.

The process of how data is collected from these data sources is depicted in Figure 3.2. For each research question the relevant part of this process is further elaborated.


Figure 3.2: Overview of research

3.3.1 Process Automation tooling usage

To be able to reason on the influence of Process Automation in the projects under research it is necessary to get an understanding of how the tooling is used by the teams during the time-frame of the study (RQ1).

RQ1 will be answered by collecting and analyzing a combination of both qualitative and quantitative data. During the implementation of the Process Automation tooling the teams will be observed by the researcher, who is also participating in the project as a Software Engineering consultant. At least once a week the daily stand-ups of the teams will be attended by the researcher. Since the researcher is participating in the team, the team is expected to have a low awareness of being observed. A medium degree of interaction is expected between the team and the researcher during these observations: the researcher will ask the team about their progress, and the team will ask the researcher for advice or tell about their experiences with the Process Automation tooling. The data that is collected during these observations are field notes, which can be seen as a third degree data collection technique. The observations of the teams will be verified with data in the ALM database. This database keeps track of all executed automation processes. Collecting data from this database can be seen as a third degree data collection technique because the data was not archived for the purpose of this research.

3.3.2 Measuring Process Changes

RQ2 was presented to provide insight into productivity from the perspective of changes that take place in a software process when Process Automation tooling is introduced. RQ2 will be answered by analyzing data from the Issue Tracker Systems of the investigated projects.

A custom program was written to export issue data and issue history data from the ITS for each of the projects. All issues that are not resolved or closed are removed from the issue log. Since every ITS uses a different data format, the mined issue logs are converted to a unified issue model. The unified model was composed to represent the minimum set of properties required to calculate the selected metrics. In the unified issue model the following data is collected for each state change of an issue:

• Issue Identifier

• Time stamp

• State (Waiting, In Progress, Resolved, Deployed)

• Issue Type (Defect, Enhancement)

The unified issue model comprises four meta-states. The states from each ITS are mapped to these meta-states. The Waiting state and Resolved state represent issues in a passive state where no value is added to the product. Issues in an active state (where value is added to the product) are represented with the In Progress state. The Deployed state indicates that the value was delivered to the customer (through a product release). Hereby, the regular process flow from issue registration to delivery in the unified issue model can be written as: Waiting → In Progress → Resolved → Deployed.

The project history data from each project contains the dates of iterations in the projects and is used to enrich the event log with deployment data. For each issue that was resolved within an iteration a Deployed event with the iteration end date as time stamp is added to the unified event log. Hereby it is assumed that the code related to an issue that was marked as Resolved within an iteration is (potentially) deployed at the end of the iteration. This enriched unified issue log allows for the calculation and generation of the selected metrics and visualizations.
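As an illustration, a minimal version of the unified issue model could look as follows in C#. The type and member names are invented for this sketch and are not the actual schema of the mining program described above.

    using System;

    // Meta-states and issue types of the unified issue model.
    enum IssueState { Waiting, InProgress, Resolved, Deployed }
    enum IssueType { Defect, Enhancement }

    // One record per state change of an issue, mined from the ITS and enriched
    // with Deployed events derived from the iteration end dates.
    record IssueEvent(string IssueId, DateTime Timestamp, IssueState State, IssueType Type);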

Unusual ITS usage could influence the measured results. Therefore, it is necessary to investigate the distributions of the metric values. When an unusually large number of issues is opened or closed in a certain time-frame, the data in the issue log should be manually inspected. Examples of unusual ITS usage are bulk-opened or bulk-closed issues, which could indicate an import of issues from another system, or a clean-up of issues that were already resolved. In both cases these issues have to be excluded from the dataset for the duration measurements because their state changes were not accurately administered.

Duration Metrics


Lead Time The Lead time metric attempts to measure the time between the moment a need for a change arises and the moment the change is delivered [45]. In the absence of more accurate data, the time stamp at which an issue was reported is considered the moment there was a need for a change. The first upcoming deployment date after the issue was resolved is used as an indicator of the moment the change was delivered.

Service Time The Service time metric attempts to measure the time needed to perform the actual work that needs to be done to resolve an issue [45]. The state change of an issue to In Progress is taken as an indicator of the moment the actual work started.

Waiting Time The Waiting time metric is the opposite of the Service time metric and measures the time that an issue is in the process, but no actual work is performed [45].

Resolution Time The Resolution time metric measures the time an issue is in an open state [10]. The time between the moment an issue was resolved and eventually reopened is not taken into account.


Figure 3.3: Relation between duration metrics and issue states

Value-added Ratio The Value-added ratio of the issue handling process is the ratio between the average time issues are in the In Progress state and the average lead time of issues [26].

For the calculation of the duration metrics, the process states from the individual projects are first translated to the states of the unified model. The metrics are then calculated by iterating over the issue events and (depending on the metric) summing the time spans between state transitions. Metrics for defects and enhancements are calculated separately.
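Continuing the earlier sketches, the calculation could look as follows; the exact bookkeeping (e.g., how reopened issues are handled) is an assumption of this sketch:

```python
from collections import defaultdict

def duration_metrics(events):
    """Compute lead, service, waiting and resolution time per issue, in days.

    events -- the enriched unified issue log (IssueEvent objects). Lead time
    runs from the first event to the Deployed event; service time sums the
    spans spent In Progress; waiting time is lead time minus service time;
    resolution time sums the spans in which the issue is open (Waiting or
    In Progress), so time between Resolved and a reopen is not counted.
    """
    by_issue = defaultdict(list)
    for e in events:
        by_issue[e.issue_id].append(e)

    metrics = {}
    for issue_id, evs in by_issue.items():
        evs.sort(key=lambda e: e.timestamp)
        deployed = [e for e in evs if e.state == DEPLOYED]
        if not deployed:
            continue  # only delivered issues are measured
        lead = (deployed[-1].timestamp - evs[0].timestamp).total_seconds() / 86400

        service = resolution = 0.0
        for prev, nxt in zip(evs, evs[1:]):
            span = (nxt.timestamp - prev.timestamp).total_seconds() / 86400
            if prev.state == IN_PROGRESS:
                service += span
            if prev.state in (WAITING, IN_PROGRESS):
                resolution += span
        metrics[issue_id] = {"lead": lead, "service": service,
                             "waiting": lead - service, "resolution": resolution}
    return metrics
```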

Cumulative Flow Diagrams

For the construction of CFDs a custom program was written that uses the unified issue log as data source. The program iterates over all issue events in the unified model and keeps track of the state of all registered issues over time. The program records the weekly number of issues for each meta-state in the measured time-frame. Finally, the collected data is plotted as a stacked area graph.
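The weekly bookkeeping can be sketched as follows (continuing the earlier sketches; the fixed one-week sampling step is taken from the text, the rest is illustrative):

```python
from collections import Counter
from datetime import timedelta

def weekly_state_counts(events):
    """Tally, for every week, how many issues are in each meta-state."""
    events = sorted(events, key=lambda e: e.timestamp)
    current, rows, i = {}, [], 0
    week = events[0].timestamp
    while week <= events[-1].timestamp:
        # Apply all state changes up to and including this week.
        while i < len(events) and events[i].timestamp <= week:
            current[events[i].issue_id] = events[i].state
            i += 1
        rows.append((week.date(), Counter(current.values())))
        week += timedelta(weeks=1)
    return rows  # plotted as a stacked area graph to obtain the CFD
```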

Issue Churn Views

Similar to the construction of the CFDs, a custom program was written that iterates over the issue events in the unified issue log. For each month in the measured time-frame, a list of new and old issues is recorded to determine the numbers of new issues (open and closed) and old issues (<6 months open, >6 months open, and closed). Finally, the collected data is plotted as a stacked bar chart.
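The classification step could be sketched as below; the exact category definitions are one plausible reading of the view, and the month boundaries are approximated:

```python
from collections import Counter
from datetime import timedelta

def monthly_churn(issues, months):
    """Classify issues per month for the issue churn view.

    issues -- mapping issue_id -> (opened, closed_or_None) datetimes
    months -- list of month-start datetimes covering the measured time-frame
    """
    rows = []
    for month in months:
        counts = Counter()
        for opened, closed in issues.values():
            if opened > month:
                continue  # not yet reported at this point in time
            is_closed = closed is not None and closed <= month
            if opened >= month - timedelta(days=30):   # reported this month
                counts["new closed" if is_closed else "new open"] += 1
            elif is_closed:
                counts["old closed"] += 1
            elif opened >= month - timedelta(days=182):
                counts["old open, <6 months"] += 1
            else:
                counts["old open, >6 months"] += 1
        rows.append((month.date(), counts))
    return rows  # plotted as a stacked bar chart
```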


Process Model Discovery

The Process Mining tool Disco¹ is used to perform Process Model Discovery. Disco can generate a process model of nodes and edges from the actual process as recorded in the raw event logs. To get an understanding of the actual process steps, the raw issue tracker data is imported into Disco (rather than the unified issue data). The event log is filtered to contain only events from defects. Disco is then used to create a visualization of the discovered defect resolution process, with all paths and activities found in the raw issue data.

Process Conformance Analysis

To gain insight into how the process as it is executed changes over time, it is necessary to define a frame of reference to measure conformance against. The process model as it is perceived by the team is used as this frame of reference. The teams model their software process in a session with the researcher: each team is provided with the process states from its ITS and asked to draw all state transitions between those states as they perceive them in their defect resolution process. The models are converted by the researcher to Petri nets with the tool Yasper².

For the Conformance Analysis, the Process Mining tool ProM³ is used to replay the raw event log on a process model and calculate the conformance metrics Fitness, Behavioral Appropriateness and Structural Appropriateness. The level of conformance is then calculated by taking the mean of these three orthogonal conformance metrics, each of which yields a value between 0 and 1.
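Written out, with fitness $f$, behavioral appropriateness $a_B$ and structural appropriateness $a_S$, each in $[0, 1]$:

$$\text{conformance} = \frac{f + a_B + a_S}{3}$$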

First, a conformance analysis is performed on the whole event log for defects to get an understanding of the general conformance of the event logs regarding the perceived process models in the projects under research. To gain insight into changes in process conformance over time, the deployment dates collected from project management data are used as measurement points. For each time-frame between deployment dates, the conformance of the issue events of the defects that were resolved within that time-frame is measured. Finally, the conformance metrics for each deployment date are visualized in a line chart.

3.3.3 Quantifying Issue Handling Efficiency

To answer RQ3, the Resolution Time metric values are used to rate both the defect resolution speed and the enhancement resolution speed. The metric values are aggregated to a resolution rating according to the threshold values as published in [11]. The approach and corresponding threshold values were described in section 2.5.

First, a custom program was written to translate the previously calculated resolution time metric values for defects and enhancements to risk categories. Second, a risk profile was constructed for both the defect resolution process and the enhancement resolution process. Each risk profile can then be translated to a quality rating between 0.50 and 5.50 by using the interpolation function described by Alves et al. [3]. Both risk profiles are plotted as stacked bar charts.
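The categorization step can be sketched as below. The threshold values shown are placeholders; the actual values are those published in [11] and summarized in section 2.5, and the subsequent translation of a profile to a 0.50-5.50 rating follows the interpolation function of Alves et al. [3]:

```python
import bisect

# Placeholder risk-category thresholds in days; the study uses the
# threshold values published in [11] (see section 2.5).
RISK_THRESHOLDS = [28, 70, 182]
CATEGORIES = ["low", "moderate", "high", "very high"]

def risk_profile(resolution_times_days):
    """Translate resolution times into the fraction of issues per risk category."""
    counts = [0, 0, 0, 0]
    for t in resolution_times_days:
        counts[bisect.bisect_left(RISK_THRESHOLDS, t)] += 1
    total = sum(counts) or 1
    return {cat: n / total for cat, n in zip(CATEGORIES, counts)}
```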

To investigate the trends in both defect resolution efficiency and enhancement resolution efficiency, the resolution rating is calculated for each time-frame between the deployment dates of the projects. First, a custom program was written that links issues to time-frames according to the final date on which they were resolved. Second, a risk profile for both defect resolution and enhancement resolution was constructed for each time-frame. These risk profiles were then translated to a resolution rating. The results are plotted in a line chart.

¹ Fluxicon Disco, http://www.fluxicon.com/disco/
² TU/e Yasper, http://www.yasper.org/
³ ProM Process Mining Workbench

3.3.4 Measuring Technical Quality

RQ4 was introduced to measure and control the technical quality of the code-base of a product during the introduction of Process Automation tooling. The technical quality of the code-base is measured by applying the SIG Maintainability Model as developed by the SIG (see section 2.4).

Downloading Snapshots of the Code-base

To investigate how technical quality evolves over time when a phenomenon like Process Automation is introduced, snapshots of the code-base are analyzed at time intervals. In this study, project iterations are used as time intervals, as these iterations represent the start and end moments of work around a certain theme in a project. Prior to performing code analysis, snapshots of the code-bases of all three projects have to be collected.

All three projects use TFS Version Control⁴ as their VCS, which provides an API with functionality to retrieve a snapshot of the code-base on a certain date. However, each project was found to organize its version control workspace in a different way with respect to the use of branches and sub-products. Calculating code metrics over a snapshot of the entire workspace would not result in a representative rating for the code-base on which the team worked in a certain time-frame. For example, an analysis of an entire workspace with multiple branches of the same product would result in high percentages of duplicated code. To arrive at a meaningful maintainability rating, only the product branches from the workspace that a team worked on in a certain time-frame are considered.

To select the product branches that a team worked on during a certain time-frame, all commit transactions performed in that time-frame are analyzed. The commit transactions can be queried with the TFS Version Control API. From these transactions, the set of Solutions⁵ that a team worked on during the time-frame is determined and downloaded with the TFS Version Control API.
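The selection logic can be sketched independently of the API, assuming transactions are available as (date, changed paths) pairs; the path convention used to identify a product branch is hypothetical:

```python
def touched_branches(transactions, start, end):
    """Return the set of product branch roots committed to within a time-frame.

    transactions -- iterable of (datetime, [server_paths]) pairs, e.g. obtained
                    through a history query against the TFS Version Control API
    Assumes the (hypothetical) convention that the first two path segments
    identify the product branch, e.g. '$/ProjectA/Main/...'.
    """
    roots = set()
    for date, paths in transactions:
        if not (start <= date <= end):
            continue
        for path in paths:
            parts = path.lstrip("$/").split("/")
            if len(parts) >= 2:
                roots.add("$/" + "/".join(parts[:2]))
    return roots
```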

Calculating Code Metrics

For the calculation of the code metrics of the SIG Maintainability Model, a custom program was written that takes a directory as input and returns the metric values and maintainability rating as output. The program calculates code metrics for the code in the directory by using techniques such as regular expressions, parsing source code, and traversing abstract syntax trees. The Roslyn compiler⁶ is used to parse source code into abstract syntax trees and to analyze program semantics to find type dependencies. The technical implementation of each code metric is briefly described in Appendix B.

To gain insight into trends in the maintainability of the code-base of a project, the metric ratings and maintainability ratings are plotted over time. As a snapshot of the code-base often consists of multiple branches, only the solutions of the branches that contain commit transactions between the previous snapshot date and the current date are analyzed. The maintainability ratings of multiple solutions are weighted by volume to arrive at a single maintainability rating for a snapshot date.
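The volume weighting can be expressed as a weighted mean over the analyzed solutions; using lines of code as the volume measure is an assumption of this sketch:

```python
def snapshot_rating(solutions):
    """Combine per-solution maintainability ratings into one snapshot rating.

    solutions -- list of (rating, volume) pairs, where volume is, e.g.,
                 the number of lines of code of the solution (assumed weight)
    """
    total_volume = sum(volume for _, volume in solutions)
    return sum(rating * volume for rating, volume in solutions) / total_volume
```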

⁴ Microsoft Team Foundation Server, http://www.visualstudio.com/en-us/products/tfs-overview-vs.aspx
⁵ In Microsoft Visual Studio, a solution is a container for projects/packages, and tracks dependencies between those projects/packages.


Relation between Maintainability and Issue Handling Efficiency

A custom program was written to combine the maintainability dataset with the defect resolution dataset on the one hand and with the enhancement resolution dataset on the other. The data points are first explored by plotting all rating values over time in a line chart. The data points are then visualized in a scatter plot. Finally, a correlation test is performed.

Both the maintainability rating and the issue resolution ratings have an ordinal scale. Both are ratings derived from data with an asymmetric distribution rather than a normal distribution [11, 23]. Therefore, the Spearman rank-correlation method [57] is chosen to analyze the correlation in the dataset. To maintain statistical independence, only data points with a distance of at least two weeks between consecutive snapshots of a project are considered. A correlation analysis is performed on the combined dataset as well as on the dataset of each individual project. A statistical confidence level of 95% (p ≤ 0.05) is chosen to qualify significant correlations.
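The test can be reproduced with SciPy; the data below is illustrative:

```python
from scipy import stats

# Illustrative paired samples of maintainability and resolution ratings,
# taken at least two weeks apart to maintain statistical independence.
maintainability = [3.1, 2.8, 3.4, 3.0, 2.6, 3.2]
resolution = [2.9, 2.5, 3.6, 3.1, 2.4, 3.0]

rho, p_value = stats.spearmanr(maintainability, resolution)
if p_value <= 0.05:  # the 95% confidence level chosen in this study
    print(f"significant correlation: rho={rho:.2f}, p={p_value:.3f}")
```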

3.4 Threats to Validity

This section describes the validity threats for this case study design. Threats are classified using the categories Construct Validity, Internal Validity, External Validity and Reliability, as proposed by Yin [71].

3.4.1 Construct Validity

Measuring changes in productivity This study uses a set of metrics and visualizations that are expected to provide insight into the influence of Process Automation tooling on the productivity of software development teams. Because the output of the software engineering process is hard to quantify, this study presented various quantifiable perspectives that, considered as a whole, provide insight into changes in aspects related to productivity. The question How productive is a software development team? is hereby substituted with other questions, such as: How fast does the team resolve issues? How efficient is the issue handling process? How consistently is the software process carried out? What is the quality of the code-base? The answers to these questions provide perspectives on the productivity of a software development team. Because productivity in software engineering is influenced by many factors, the list of perspectives chosen in this study is not considered complete. However, this study attempts to control two factors that influence productivity, namely the maintainability of the code-base and the maturity of the software process.

Quality of the data ITS data was chosen as the data source to measure changes in the issue resolution process. The software process of the software development team is reflected in the issue log. However, state changes are not always accurately administered. Some issues may have been resolved without being registered in the ITS at all. Issues may have been opened in bulk (imported from an external system) or closed in bulk (a clean-up of issues that had already been resolved). To increase the quality of the ITS data, issues that were opened or closed in bulk were filtered from the dataset. Although ITS data is not always accurately registered, we believe it is still a useful source of data to gain insight into the characteristics and trends of a process and an understanding of how work is handled by a software development team.

Measuring process changes This research explored various aspects of measuring process changes: duration, efficiency, flow and consistency. The metrics to measure these aspects of process changes were selected by using the GQM paradigm [7, 65]. Nevertheless, more process aspects could have been measured; examples are the number of reopened issues over time, defect density, code churn, the functional size of enhancements, etc. The set of measured aspects is not complete. However, the chosen aspects are expected to be influenced by the introduction of Process Automation tooling.

Quantifying issue resolution speed For the quantification of issue resolution speed over time, this study replicated one benchmark-based approach that relies on ITS data. A study of existing literature did not provide other approaches for quantifying issue resolution speed. However, other approaches using other sources of data might also be suitable. Thus, the approach presented in this study is not a definitive solution.

Measuring trends in technical product quality To measure trends in the technical quality of a software product, this study implemented one benchmark-based approach. However, other approaches exist, such as the SQALE Model [33] and the Maintainability Index [41]. This research did not compare the chosen approach with other approaches to measure the technical quality of a software product. Therefore, the chosen approach is by no means a definitive solution.

3.4.2 Internal Validity

Confounding factors The aim of this study is to provide approaches for measuring the influence of Process Automation on the productivity of software development teams. There are many factors in software engineering that influence productivity [12, 31, 68]. As previously mentioned, this study attempts to control the technical quality of the code-base and the maturity of the software process. However, many other factors influence productivity, and it is impossible to identify and control all of them. Examples of other factors are the skills of the developers in the team, the granularity of issues, demanded reliability levels, organizational factors, cultural factors, etc.

3.4.3 External Validity

Level of Process Automation The projects in this research are expected to implement Process Automation tooling in their software processes during the time-frame of this study. To validate the measurements and visualizations in this study for the purpose of measuring the influence of Process Automation on productivity, the projects must implement sufficient levels of Process Automation. The extent to which Process Automation is implemented will influence the generalization of the results to projects planning to implement higher levels of Process Automation.

Size of sample The proposed metrics and visualizations are tested on a small sample of three proprietary projects. Hence, the generalization of the results of this study to other projects should be approached conservatively. The three projects have similar characteristics that are typical of small to medium-scale information systems and use iterative development processes, which are widely used in information systems development. However, software projects in different problem domains have different functional complexities. It is therefore questionable whether software projects with similar code quality and issue handling efficiency would benefit from the same increase in productivity when the same Process Automation tooling is used.

Life-cycle stage The projects investigated in this research are in the initial development stage or the evolution stage. It is expected that these projects have different issue handling characteristics from projects in the servicing stage. Therefore, it is expected that the metrics and visualizations presented in this study need adjustments in order to be suitable for measuring the influence of Process Automation on the productivity of software development teams in projects in the servicing or maintenance stage.

Generalization to open-source projects The three projects all involve employees from the same company with different levels of experience. It is likely that the same mix of employee experience can be found in other proprietary software projects. However, open source projects often have
