

University of Amsterdam

Master’s Thesis

Manifestation of Bugs in the Process of

Delivering Technical Software Systems

Author:

Yosuf Haydary

UvA Supervisor: Dr. Magiel Bruntink
CGI Supervisor: Drs. Arjen van Schie

A thesis submitted in partial fulfilment of the requirements for the degree of Master of Science in

Software Engineering

Status: Final


بنی آدم اعضای یکدیگرند که در آفرینش ز یک گوهرند - سعدی

“Human beings are members of a whole, In creation of one essence and soul.”


UNIVERSITY OF AMSTERDAM Faculty of Science

Software Engineering

Abstract

Manifestation of Bugs in the Process of Delivering Technical Software Systems

by Yosuf Haydary

This case study focuses on analyzing and understanding the post-release issues of two industrial systems, which are distributed over different sub-systems and environments. The analysis and understanding can provide guidance to detect similar issues earlier, before a system is released.

In this study, a manual inspection of more than 1100 issues is performed to identify the origin and context in which these issues are found. The issues could respectively be put into the following categories: 1. application integration (332 issues), 2. environment integration (124 issues), 3. deployment (111 issues), 4. middleware (110 issues), and 5. functional configuration (82 issues). 342 issues could not be put into any category. The remainder of the issues are identified as application configuration, test-tool, system configuration, incidental, and performance related issues.

This study also shows that more than 30% of the issues of these industrial systems are not software bugs, but rather documentation, requirement or process related issues, which is comparable to open-source software systems.

A quick study of the delivery process of one of the available systems indicates that the lack of a representative environment and the lack of early system integration during development are possible root causes for the late detection of these issues. The presence of a production-like environment and early integration of these systems might help to detect some of the similar issues earlier.


Acknowledgements

786

I would like to take this opportunity to express my gratitude towards Dr. Magiel Bruntink and Drs. Arjen van Schie for their supervision, useful feedback and guidance in fulfilling this research. I am thankful to all my teachers, the assistants and the staff of the Software Engineering programme at UvA.

I thank my colleagues and managers at CGI, especially MSc. Gertjan Spierenburg and Ing. Erik Dielen for their time and feedback. I also thank the organization that has provided the research data.

Last but not least, my immense gratitude goes out to my parents, family and friends for their love and support, without whom all this would not have been possible.

Yosuf Haydary 9 July 2014


Contents

Abstract

Acknowledgements

1 Introduction

2 Motivation & Problem Statement

3 Background & Context
  3.1 Terminology
  3.2 Related work
  3.3 Description of Ashqary and Saadi
  3.4 The process of delivery
    3.4.1 Pre-release
    3.4.2 Post-release

4 Research Method
  4.1 Methodology
  4.2 Design and Validation
    4.2.1 Understand the issues
    4.2.2 Look for Trends
  4.3 Case Study Protocol

5 Conducting the Research
  5.1 Understanding the issues
    5.1.1 Usability
    5.1.2 Classification
    5.1.3 Manifestation (Tagging)
    5.1.4 Validation
      5.1.4.1 Resolve the doubtfuls
      5.1.4.2 Get expert confidence
      5.1.4.3 Visualize researcher's accuracy
  5.2 Looking for trends
  5.3 Other Observations

6 Results & Analysis
  6.1 Understanding the issues
    6.1.1 Usability
  6.2 Trends
    6.2.1 Severity and priority
    6.2.2 Classification
    6.2.3 Manifestation
      6.2.3.1 Refining the untagged
    6.2.4 Analysis effort and duration

7 Discussion
  7.1 Severity and priority
  7.2 Classification
  7.3 Manifestation
  7.4 Analysis effort and duration
  7.5 The possible root causes
    7.5.1 Lack of representative environment
    7.5.2 Late integration
  7.6 Early detection
  7.7 The issue with issues

8 Threats to Validity

9 Conclusion
  9.1 Future work

A The Case Study Protocol


Dedicated to my parents Wahida & Wali.


Chapter 1

Introduction

The ongoing change and the demand for innovation in the software industry have resulted in more bugs and undesired behavior in software systems than before, Zeller [1], Sullivan and Chillarege [2] and Sutherland and van den Heuvel [3]. Despite continuous testing and the use of different testing techniques during development and maintenance, software systems are still delivered with bugs.

Critical bugs not only play a major role in the quality assessment of systems, Young [4], but they can also have a major economic impact on IT systems, Tassey [5]. So, what can these bugs teach us?

This research focuses on analyzing and understanding post-release bugs of two industrial technical software systems, Ashqary and Saadi1, in order to suggest improvements to detect similar bugs earlier at the Technical Software Engineering (TSE) department of CGI.

In the following chapters, the motivation behind this research is given in chapter 2, followed by some background information in chapter 3. The research method, the process of conducting the research and validating the analysis are reported in chapters 4 and 5. The results and the discussion can be found in chapters 6 and 7. The report is finalized by identifying the threats to validity in chapter 8, followed by the conclusion in chapter 9.

1 Ashqary and Saadi are the names of Persian poets used as reference to the systems due to confidentiality of the data. Ashqary is a poet from Kabul from the 19th century, and Saadi is a poet from Shiraz who lived in the middle ages.


Chapter 2

Motivation & Problem Statement

Bugs have always been an interesting subject of study, and as more data becomes available the research in this area grows, Hamill and Goseva-Popstojanova [6], Herzig et al. [7], Nguyen et al. [8], Antoniol et al. [9]. Perhaps the ideal and almost impossible goal is to understand bugs and avoid them, Boehm and Basili [10]. However, bug tracking systems and bug reports are still a rich source of information to detect these bugs earlier. Detecting bugs earlier not only leads to cheap and easy fixes, but early detection can also play a major role in the reliability of systems. The earlier bugs are detected and fixed, the lower the chances that a system fails due to a bug, and thus the more stable, dependable and available a system is, Young [4], Zeller [1].

The TSE department of CGI develops and maintains technical software systems. These systems shape the IT infrastructure of the owning organizations. They play a key role in the correct functioning of organizations. These systems are required to be reliable and highly available, because the failure of such systems can have a huge negative impact on organizations.

The process of delivering these systems can be simplified into pre-release and post-release phases. The pre-release phase of delivering a system includes the development and maintenance activities performed at TSE. The post-release phase includes activities in the test and production environments of the organization owning the systems.

Despite rigorous testing at different levels during the pre-release phase, many issues are found in the post-release phase; these issue reports are available for this study. See figure 2.1 for an example of a post-release issue.


Title: The application cannot log when the configuration server is unavailable

Id: 111   Severity: Cosmetic   Priority: M   Status: Solvable   Reproducible: Y   Subsystem: Subsystem x

Insertion Date: <Date>

Closing Date: <Date>

Description

One of the following happens when the configuration server is not available:

1. If the application has started before, a message will appear: “Error while reading the configuration. See the system log for more details.”

However, there is nothing about this issue in the log file. This happens because the log4j configuration is loaded from the configuration server. Because the configuration server is unavailable nothing will be written to the system log.

2. If the application has never started before, a message will appear that the *.* path cannot be found.

Comments

Analyzer1, <Date>: Is it possible to use a default logging location in case the application has started before (by webstart) but the configuration server is unavailable? This will at least log that the configuration server is not available to load the log4j configuration.

________________________________________

Analyzer2, <Date>: Also registered at our issue tracking system. _______________________________________

Analyzer1, <Date>: Today’s decision: Solve it. In other words, even when the configuration server is not available the application should be able to make a log file with at least the message “configuration server is not available”.

In this issue report, the following fields, which are either irrelevant, confidential or never filled in, are dropped: Cause, Duplicate of, Detection Phase, Introduction phase, Estimation to solve, Owner, Planned version to solve, ID of proposed change, Solution time, Project, Release planning, Release found in, Subject, Reporter, Type of solution or rejection, Found in Cycle, Reconsideration date, Goal Cycle, Closing version, Version found in.

Figure 2.1: An example of a simple post-release issue translated from Dutch.


Q1. “Based on the analysis of industry bug data, what category of bugs can be detected more efficiently before a software system is released at TSE?”

Discovering bugs earlier based on the given bug data requires digging into the data. It is critical to understand how these bugs are categorized, grouped and so on. To get a better understanding of this situation, the following sub-question needs to be answered:

Q2. “What types of bugs occur the most, categorized by their severity and priority attributes? ”

While classification of bugs based on severity and priority is useful to determine their impact, localizing where the bugs really originate is another fundamental aspect.

Q3. “How are the different categories of bugs related to aspects such as integration, configuration, process, documentation, and the different layers like operating system and middleware?”

A recent study shows that 33.8% of issues reported as bugs are not bugs, Herzig et al. [7]. This figure is based on data from 5 open-source projects. Although they assume that, due to better process arrangement, there are fewer misclassified issues in industry, it is still important to validate this assumption and measure the misclassification. Misclassification in industrial data-sets is also studied by Nguyen et al. [8], who report that even in a near-ideal situation the bias is present. So, assuming that the post-release issues of Ashqary and Saadi are reported as bugs, does the following hypothesis hold?

H1. “The industrial bug data is less misclassified than the open-source bug data because of better process arrangement.”

Normally, an issue is first observed and reported. The issue is then analyzed and resolved by solving or rejecting it. It is possible that certain categories and types of issues take longer and need more analysis. So:

Q4. “How much analysis effort and resolution time is required for the different categories of bugs? ”

1 Q1 = Question 1.


Finally, the answers to the above questions can provide insight into the shortcomings of the processes, technologies and testing techniques used to develop, maintain and deliver these systems at TSE, and can be used to propose improvements for early bug detection.

Q5. “Based on the research analysis, what are the visible shortcomings of the processes, testing methodologies and techniques at TSE? ”

As mentioned earlier, the scope of this study is limited to the issues that are found after releasing the two systems. These issues are mainly reported during testing activities in the test environment and do not include the issues found by the development or maintenance teams before releasing these systems. Although the goal is to improve the process of delivering systems, this study focuses on understanding these bugs. The cost-efficiency of detecting bugs earlier is not in the scope of this project.


Chapter 3

Background & Context

This chapter presents some background information on the terminology, relevant literature, and the systems from which the issues in this study originate.

3.1 Terminology

Many terms in the area of testing and bug detection are quite clear and describe a different context or state of a problem, yet there are also many terms that are used and misused in this area. Examples are ‘defect’, ‘bug’, ‘issue’, ‘error’, ‘failure’, ‘infection’, ‘fault’, ‘flaw’, ‘anomaly’, ‘incident’, ‘problem’ and many more. This is also reported by Hamill and Goseva-Popstojanova [6].

The terminology in this research is outlined in chapter 5, which is mainly inspired by IEEE Standard Classification for Software Anomalies [11], Wagner [12], Hamill and Goseva-Popstojanova [6], Herzig et al. [7], ISO/IEC/IEEE 24765 Systems and software engineering [13], and Zeller [1].

3.2 Related work

Pfleeger and Hatton [14] studied a set of issues. While their goal was to investigate the influence of formal methods, the goal of this study is to detect issues earlier. They do provide recommendations and a basic methodology for executing such studies, such as analyzing the data in phases, which is used in this study.

Herzig et al. [7] studied the misclassification of issues, and how this misclassification can impact bug prediction systems. Their study provides a good overview for classifying the issues, which has been very useful in designing this study. They have used issues from 5 open-source projects; this study uses the issues of 2 industrial systems.

Hamill and Goseva-Popstojanova [6] have studied the localization of faults that lead to individual software failures and the distribution of different types of software faults. There is an overlap between the different categories used in their study and this study. They have put requirement related issues and data related issues at the same classification level. In this study there are two classification levels. First, the classification from the perspective of whether an issue is a bug or not a bug (like requirement or documentation issues). Second, the localization of the issues in the software and the context in which they are found, such as data problems and application integration issues.

Mogul [15] has studied the emergent behavior of complex software systems, which in some ways overlaps with the issues of the complex systems used in this study. He also provides a list of examples to understand emergent behavior better.

Also, other studies have been done in the same area like Nakashima et al. [16] who studied the bugs of a project to improve the software design process.

3.3 Description of Ashqary and Saadi

The issues in this case study come from two industrial systems, Ashqary (728 issues) and Saadi (416 issues). Both of these systems are mission critical and must comply with high availability and reliability requirements. These systems are part of the organization's IT infrastructure renewal towards a new architectural vision.

Saadi is in production and Ashqary is still in the testing phase. This means that the data of Ashqary mainly comes from the testing and acceptance environments, while some of the issues of Saadi also come from production environments.

The communication between the different components of these systems is asynchronous.

Ashqary consists of around 112.000 C++ SLOC1. Saadi consists of around 135.000 Java SLOC. These two numbers only indicate the size of the subsystems in the Linux environments as depicted in figures 3.1 and 3.2. The size of the projects as a whole, including the legacy environments, is not available.

1 SLOC: source lines of code.


[Figure 3.1: Simplified deployment view of Ashqary, showing a Linux environment (multiple Linux nodes, components 1 to 6, message oriented middleware and buses) and legacy environments (multiple legacy nodes, component 7, legacy buses and bridges), with redundancy and two-way data communication. Diagram not reproduced.]


[Figure 3.2: Simplified deployment view of Saadi, showing a Linux workstation (rich client, X server), Linux front-end and back-end environments with an application server, middleware, back-end components, a configuration server and a database (state), connected over buses and bridges to a legacy environment with multiple legacy nodes. Diagram not reproduced.]

3.4 The process of delivery

The findings in this section only pertain to Ashqary and its components that belong to the Linux environment, see figure 3.1. Ashqary is being developed and maintained at TSE. The information gathered in this section is mainly provided by the system architect of the development team, by informal contact with the team members, and partly by reading documentation.

3.4.1 Pre-release

Ashqary is a mission critical system. It follows a waterfall-like process. The system requirements are determined and documented by the specification team and then handed over to the development team. The development team is responsible for the right design and implementation.

The process is then refined in an iterative way. The development team iterates every two weeks. The process is very well documented and every step is clear. Also, one of the team members watches over the process to ensure it is followed as defined. The change requests and issues for an iteration are estimated in time and complexity, and split into tasks. Also, because Ashqary has security related aspects, tasks are marked separately if they touch the security aspect.

At the end of each iteration, a potentially shippable set of installation packages, which consists of different subsystems and configurations, is delivered to the customer. The new shipment is then ready to be tested.

The subsystems are developed according to recommended coding standards. The project is measured at code level on different aspects like modularity, complexity, source lines of code, testability and so on. Also, QAC++2 is used to detect potential bugs.

The environment which is used by the development team for testing only consists of the Linux part of the environment. The system as a whole is not present; only a simulator for component 7, which is depicted in figure 3.1, is present. The number of nodes and physical machines is representative enough to perform the required tests. However, the operating systems and the related configuration are not representative. This is because the development team only gets the operating system specifications, like the versions, but not the exact images that are crafted for the test and production environments.

The data used for testing during development can be separated into functional configuration and messages that are exchanged between the different subsystems. The functional configuration is representative enough. However, the messages that are used for testing are not representative, because the team has produced these messages according to specifications, which are not real production data.

2 “Leveraging our core capabilities, QAC++ is the most sophisticated static analysis solution for advanced C++ environments, combining language compliance (up to the latest C++11 standard) with advanced language and dataflow analysis. With compliance packages for MISRA C++, HIC++ and JSF AV C++ coding standards, QAC++ offers an automated, highly effective means of analyzing your code against your chosen coding standard, with metrics and code structure visualizations bringing a further level of clarity to complex C++ projects.” http://www.programmingresearch.com/products/qacpp/

Automated tests are present at different levels like unit, module and component. It is however very hard to produce and maintain tests at component level because the test documentation is either missing or incomplete.

Ashqary has complex requirements, configurations, interfaces, and representation of data, which has made the system as a whole complex as well.

3.4.2 Post-release

The test environment is owned by the customer where the system testers operate. It is located in a different city. The test team is separate and not part of the development team. The test environment looks very much like the production environment on which the system is supposed to run.

Testing starts by installing, configuring and running the system after one or more components are delivered/released. The issues which are found are reported and registered in the system-wide issue tracking system.

The new shipment also contains rigorous documentation which is evaluated and used to track the new features and fixed issues.

Occasionally, a team member visits the test environment to reproduce, analyze and fix issues.


Chapter 4

Research Method

This chapter describes the methodology, design, and validation steps to carry out the research.

4.1 Methodology

Runeson and Höst [17] define case studies as “investigating contemporary phenomena in their contexts”. The context of this research, to study a set of available data and explore its characteristics, fits well with a case study. Research methodologies like formal experiments and surveys do not apply to this project. First, because this project does not involve designing any controlled experiment. Second, it lacks a generalizable set of data, Kitchenham et al. [18], Runeson and Höst [17].

This project should be seen as an exploratory case study. According to Shull et al. [19] “exploratory case studies are used as initial investigations of some phenomena to derive new hypotheses and build theories”, which matches the goal of the data analysis in this study to provide a basis for further investigation and improvement.

In the field of Software Engineering, analysis of data from databases is an “independent technique”, Shull et al. [19]. The advantage of this technique is that “A large amount of data is often readily available. The data is stable and is not influenced by the presence of researchers”. A disadvantage of this technique is the lack of quality and quantity control, for example missing information in description fields.


4.2 Design and Validation

Some of the required information, like the duration, severity and priority of issues, is already available in the given set of data. Also, the context and background information about the process of delivering the systems at TSE is already available. However, attributes like the classification and origin of the issues are not present and need manual inspection to reveal them.

The activities that are part of the research are divided into two phases: 1. understand the issues, and 2. look for trends. Dividing the study into phases is inspired and recommended by Pfleeger and Hatton [14], who analyzed bug reports to study the influence of formal methods.

4.2.1 Understand the issues

Understanding the issues involves manual inspection of each individual issue and putting it in one or more categories and classes. It includes looking at the issue from aspects like its usefulness, its relation to integration and configuration, its classification and other characteristics which are required to answer the research questions. An initial set of inspection criteria should be kept in a case study protocol (explained later), and updated as required. Eventually, this phase should provide a set of individually inspected issues upon which different qualitative analyses can be performed in the next step.

This phase is the core of the explorative aspect of the case-study. During exploration of the issues the researcher should try to remain unbiased (Shull et al. [19]) to reveal the different relevant characteristics of issues.

However, this step directly introduces a major validation risk, which is the inevitable human bias. To mitigate this risk and validate the results, the following steps are designed:

1. Consult an expert1: During the inspection of the issues, doubtful data should be marked and consulted on with an expert.

2. Get expert confidence: After the issues are inspected, some of the issues should be randomly checked and discussed with an expert. Further steps may only be taken if the expert is also confident about the results.

3. Visualize the researcher's accuracy: Re-inspect at least about 10% of the issues. Measuring the deviation between the initial inspection and the re-inspection should visualize the researcher's accuracy during inspection. Re-inspecting less than 10% of the issues might not be enough to see the accuracy. (A sketch of this measurement is given below.)

1 An expert in this context is a system architect or any other system expert of Ashqary or Saadi who can provide more information in order to understand the issues better.
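As a rough illustration of step 3, the sketch below shows how a roughly 10% re-inspection sample could be drawn and how the deviation per aspect (classes or tags) could be scored. This is a minimal sketch, not the tooling used in the study: the record fields and function names are illustrative.

```python
# A minimal sketch, assuming each inspected issue is a dict with 'classes'
# and 'tags' stored as sets (illustrative field names).
import random

def reinspection_sample(issues, fraction=0.10, seed=42):
    """Randomly draw about 10% of the issues for re-inspection."""
    rng = random.Random(seed)
    k = max(1, round(len(issues) * fraction))
    return rng.sample(issues, k)

def deviation(initial, redone, aspect):
    """Score one aspect ('classes' or 'tags') of a re-inspected issue."""
    a, b = set(initial[aspect]), set(redone[aspect])
    if a == b:
        return "same"
    if a.isdisjoint(b):
        return "fully different"
    return "partially different"
```

The "same", "partially different" and "fully different" outcomes correspond to the categories later reported per system in table 5.6.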

4.2.2 Look for Trends

While the previous step provides an understanding of each individual issue and reveals the different categories, this step takes the set of issues as a whole and tries to find trends among the issues by performing qualitative analysis.

Trends are the general tendency of issues and their properties towards integration, configuration and other characteristics. For example, a high number of integration issues compared to a low number of configuration issues indicates that issues tend to be more integration related than configuration related. Also, the properties of these categories, like severity and resolution time, can tend towards a direction and show a trend. Another aspect to examine is whether these trends are applicable to both projects.

Based on the set of individually inspected issues and their trends, this phase should provide the answers to our research questions. First, which categories of bugs occur the most based on their severity and priority. Second, how the bugs are organized in relation to aspects such as integration and configuration. Third, assuming that these issues are reported as bugs, validate the assumption by Herzig et al. [7] on whether the issues are really bugs or not. Also, this analysis and these trends should provide an answer about the required resolution time and analysis effort for each category of issues.

Finally, the answer to these questions should provide a lead to propose improvements for the pre-release process of delivery at TSE.

4.3 Case Study Protocol

A case study protocol, which is a tool recommended by Runeson and Höst [17] for conducting case studies, should be kept. A “case study protocol is a continuously changed document that is updated when the plans for the case study are changed”. “The case study protocol is a container for the design decisions on the case study as well as field procedures for its carrying through”, [17]. See appendix A for the case study protocol.


Chapter 5

Conducting the Research

This chapter reports on the execution and the process of conducting the case study. It mainly focuses on the manual inspection and validation of the issues.

5.1 Understanding the issues

The issues of both systems, Ashqary and Saadi, are registered in HPQC1. For the purpose of this study the issues were exported to Excel format; this export contains the system-wide post-release issues. All the records are written and maintained in Dutch.

The process of inspecting each individual issue consists of three steps: usability, classification, and tagging, which is also depicted in figure 5.1.

[Figure 5.1: Inspection process. The manual inspection of an individual issue consists of three steps: Usability (is the issue usable? duplicate, corrupt data), Classification (which class does the issue belong to? bug, requirement, documentation, ...), and Tagging (where does the issue manifest itself? integration, configuration, ...).]

5.1.1 Usability

First, each issue is evaluated on whether it is usable for further analysis or should be excluded. Table 5.1 describes the criteria for excluding the unusable issues.

1 HP Quality Center is a commercial issue tracking system.


Unusable: an issue that is not usable for this study because too much data is missing or corrupted. At least the information in the description and title fields should be present. Also, issues that are out of scope of these systems are unusable.

Duplicate: an issue that is already reported. It is clearly mentioned in the summary or a comment of the issue record that the issue is already reported.

Table 5.1: Usability criteria

The duplicate issues are kept separate because it is interesting to know how often the same issue is reported. In some cases an issue which reports a different failure of the system is marked duplicate because it is caused by an error reported earlier. This means that there is a common cause which has resulted in two different failures.

5.1.2 Classification

Next, the issue is classified as bug, requirement, documentation, process, or wrong test. The classification is based on the criteria given in table 5.2. The assumption is that all reported issues are bugs. By reading each issue, it is then determined whether the issue can be classified in any of the other classes. An issue is classified under at least one class.

Bug: an issue which “reports documenting corrective maintenance tasks that require semantic changes to source code”, Herzig et al. [7].

Requirement: an issue which results in an enhancement or a feature request.

Documentation: an issue which is resolved by updating a document.

Process: an issue which is reported due to a process problem. An example is a missing configuration file because the person who is installing has not followed the installation manual or has forgotten some installation steps.

Wrong Test: an issue which is the result of misunderstanding the specs, and thus testing a wrong spec. The result of such a test is unexpected according to the tester, but it is according to the specs.

Table 5.2: Classification criteria

Although the classification taxonomy is inspired by Herzig et al. [7], there is a subtle difference in the classification used here. In their study an issue is put in just one of 11 categories, like bug, feature request, perfective maintenance, documentation and other. During this research it is observed that while some issues report a bug, they also instill one or more feature requests. So, these issues are put in both classes.

There are also a few other reasons that can explain this difference. First, the majority of issues in this study are reported by testers only, while the issues reported in the study by Herzig et al. [7] could be reported by different groups. Second, this case study does not depend on source code, in contrast to the study by Herzig et al. [7]. Also, a bug, which is an incorrect implementation of the specification, differentiates itself from a requirement regardless of whether the requirement is a new feature request, a request for enhancement or an adaptive or perfective maintenance issue. The goal is to see what percentage of the reported issues are really bugs, which makes refinement of requirements less important.

5.1.3 Manifestation (Tagging)

Finally, an issue is tagged. Tagging is the process of finding the origin of an issue and the necessary context in which it is found and reported. The criteria used for tagging are reported in table 5.3. This step facilitates a basis by providing different categories and contexts to which issues can be related. Each of these categories can then be refined to visualize the severity and priority of these issues. An issue can have zero or more tags.

Application integration: an issue which is caused by integrating two (sub)components or (sub)systems of a system together. Moreover, such an issue is detected in the presence of the system components as a whole and might not have been found otherwise. Protocol and interface misspecifications, wrong usage/implementation, and data flow problems are application integration issues.

Environment integration: an issue which is a result of integrating a (sub)system into its runtime environment. The issue might be either in the (sub)system or the environment, but the presence of both of them is necessary to reveal the issue.

Middleware: an issue related to middleware like Glassfish, Weblogic, EMS, OpenSplice and so on. In cases where a middleware hosts an application, this issue is a more specific environment integration issue. In other cases it might be a more specific type of application integration issue, like EMS that connects different applications together.

Deployment: an issue which manifests itself in installation procedures, techniques, and human interaction required to deploy a (sub)system.

Functional Configuration: an issue relating to functional configuration. This is a set of one or more files which shapes the domain specific structure of an application upon the domain infrastructure. The functional configuration for an application is mostly provided by a different party. Example: the configuration files which are used by an airport monitoring application to determine the number of runways, their locations and so on.

Application Configuration: an issue manifesting itself in the technical properties or configuration files of an application. Examples are the time-out, trap destination IP addresses and other configurable application specific settings.

System Configuration: an issue which manifests itself at operating system level, like user rights.

Test Tooling: an issue which is the result of test tooling limitations or misbehavior.

Performance: an issue reporting a performance problem.

Incidental: an issue which is seen just once. It is resolved either because it is mistakenly reported, or because it is not reproducible anymore.

Table 5.3: Tagging criteria

An initial set of tags was defined at the start of the project, see appendix A. These tags were extended, refined, and changed as the inspection of the issues progressed. A fixed set of predefined tags would not have been possible, because only as the study continues is it explored how and where the issues really manifest themselves.

One example of refining the tagging is the integration tag. A quick summation during tagging revealed that many of the issues had a relation with integration. Since the integration tag covered both integrating a system into its environment and component-to-component integration issues, it was refined into environment integration and application integration.

Tagging is the most challenging activity in the process of inspecting the issues. First, because of the technical complexity of the system as a whole: both Ashqary and Saadi are distributed over multiple physical nodes, and they use different technologies, concepts and configurations. Second, the dependability and availability requirements of these systems make them more complex. Third, the domain knowledge required to picture the exact context in which an issue is observed and reported is complex and partly lacking. The complexity challenge can also be observed in the analysis of the issues by the different parties who are involved in developing or maintaining the systems. These parties use the comment field of the issues as a communication medium to write their analysis, with the goal of finally resolving the issue.

5.1.4 Validation

As visualized in figure 5.2 the following three steps are taken to validate the manual inspection of the issues.

5.1.4.1 Resolve the doubtfuls

During the manual inspection of the issues there were 34 issues of Ashqary and 3 issues of Saadi which were marked as doubtful, see table 5.4. Either there was doubt about the classification or about the tagging of the issues. These issues were discussed with the system experts, which eventually led to classifying and tagging them. The main reason for the lower number of doubtful issues in Saadi is the familiarity of the researcher with Saadi.

                            Ashqary  Saadi
Total issues                    728    416
Issues marked as doubtful        34      3

Table 5.4: Overview of the doubtful issues


[Figure 5.2: The validation process of inspecting the issues manually. Step 1, resolve the doubtfuls: mark the doubtful issues and consult the system expert to resolve them. Step 2, get expert confidence: let the system expert analyse some random issues; does the expert have confidence in the rest? Step 3, visualize the researcher's accuracy: re-inspect about 10% of the issues and measure the deviation between the initial inspection and the re-inspection.]

5.1.4.2 Get expert confidence

To get expert confidence in the results of the inspected issues, the two system architects of Ashqary and Saadi were consulted. The selection of issues was left to the architects, both to avoid researcher's bias and to make use of the system experts' knowledge and experience. The experts randomly selected some issues and evaluated them.

The expert would then reclassify or change the tagging partially or fully. If all the tags and classes of an issue were changed, it is noted as a full change, otherwise as a partial change. In some cases it was clear that the issues were either complex or doubtful, which did not always result in an explicit outcome.

After the evaluation sessions, the experts were confident about the evaluation of all of the issues and their usefulness for the rest of this research. Table 5.5 summarizes the expert evaluation.

The results of the analysis were also presented to a group of 25 people who are involved in developing and delivering Ashqary. The results were recognizable to the audience.

                                          Ashqary  Saadi
Total issues                                  728    416
Total issues evaluated by expert               16     16
Expert fully reclassified or re-tagged          1      1
Expert partially reclassified or re-tagged      1      6

Table 5.5: Overview of the expert evaluation


5.1.4.3 Visualize researcher’s accuracy

As a last step towards validating the manual inspection, some of the issues were re-inspected. The re-inspection occurred about two weeks after the last activity of the initial inspection. These issues were randomly selected from the raw data, which did not contain any inspection information. Table 5.6 summarizes the results.

                             Ashqary  Saadi
Total issues                     728    416
Re-inspected issues               61     40
Fully different class              8      5
Partially different class          7      2
Fully different tagging            8      3
Partially different tagging       15     13

Table 5.6: An overview table of the researcher's accuracy

The largest differences can be seen in the tagging. 15 out of 61 re-inspected issues of Ashqary were partially tagged differently from the initial inspection. Also, in Saadi 13 out of 40 issues were partially different. This deviation can be explained by the complexity of the systems, the issues and the domain. Another explanation is the fact that in some cases a general tag might be applied instead of a more specific tag. For example, it might be the case that an issue is first tagged with environment integration, while during the re-inspection the middleware tag, which is a more specific environment tag, is used.

5.2 Looking for trends

This phase of the research involved performing qualitative analysis on the inspected issues. The results are presented as tables, charts and graphs in chapter 6. The main reason for this phase is to look for the general tendency (trends) of issues and their properties towards different categories such as integration, configuration and other characteristics, like resolution time, which are used in the previous phase.

In this phase the classification and the categorization of the issues are visualized. At the same time, the duration, impact and severity of issues are presented per category. Also, the number of comments per issue is used as an indication of how much analysis effort is required until an issue is resolved.

After visualizing the origins and contexts of the issues based on tags, and finding out that about 1/4 of the issues were untagged, the untagged issues were refined based on their classification.


5.3 Other Observations

The following observations were made as well during inspecting the issues:

• Classifying and tagging would not have been possible at all without reading the comments. While the issue description just reports what is observed, the comments contain, in most cases, the full analysis of the issue and its origin.

• Understanding the core of the reported issue is sometimes like finding a needle in a haystack. Different reporters use different reporting styles. The description sometimes contains so many log lines that it becomes nearly impossible to read and follow the issue.

• HPQC, and perhaps many other issue tracking systems, focuses on recording the “management” meta data of issues. The core of an issue and its analysis, which are the most important aspects in localizing the issue, have to be traced in the description and comment fields, while the other 30+ fields are not very interesting from this point of view.


Chapter 6

Results & Analysis

This chapter presents the results obtained by conducting the research as described in chapter 5. While the findings about understanding the individual issues are reported in 6.1, this chapter mainly focuses on the trends and results obtained by looking into the issues as a whole presented in 6.2.

6.1 Understanding the issues

The manual inspection of individual issues involved reading and evaluating 1144 records. It finally resulted in: 1. an evaluation of whether an issue is usable for further analysis, 2. the classes an issue belongs to, and 3. the tags that can be applied to an issue.

Each issue record consisted of more than 30 fields. The fields that were used during this research are shown in table 6.1. The fields that were not used are either irrelevant, unusable or confidential. Figure 2.1 in chapter 2 shows an example issue.


Field name Remarks

Title Always filled in

Description Always filled in

Comments Sometimes empty

Id Always filled in

Severity Always filled in

Priority Always filled in

Status Always filled in

Subsystem Always filled in

Insertion Date Always filled in

Closing Date Empty if issue not resolved

Table 6.1: The issue fields that are used in conducting this research

In total, the issues of Ashqary and Saadi contained around 347082 words in the description and comment fields. Roughly, each issue had around 300 words on average in these two fields, see table 6.2.

Ashqary Saadi

Total issues 728 416

Total words in description & comment fields of all issues 216927 130155

Total word count mean/issue 298,0 309,2

Comment word count mean/issue 200,6 184,0

Comment count mean/issue 4,7 4,1

Max comment count 37 20

Min comment count 0 0

Table 6.2: Some general facts about issues
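The counts in table 6.2 can be derived from the exported issue sheet along the following lines. This is a minimal sketch under assumptions: the file name and the English column names are hypothetical (the real HPQC export uses Dutch field names), and the way individual comments are separated is an assumption based on the divider lines visible in figure 2.1.

```python
# A minimal sketch of deriving the figures in table 6.2 from the Excel export.
# File name and column names are hypothetical; the real export is in Dutch.
import pandas as pd

def words(text):
    return 0 if pd.isna(text) else len(str(text).split())

issues = pd.read_excel("ashqary_issues.xlsx")

desc_words = issues["Description"].map(words)
comment_words = issues["Comments"].map(words)
# Assumption: individual comments are separated by the underscore divider
# lines visible in the exported records (see figure 2.1).
comment_count = issues["Comments"].map(
    lambda c: 0 if pd.isna(c) else str(c).count("____") + 1)

print("total words:", int((desc_words + comment_words).sum()))
print("mean words/issue:", round((desc_words + comment_words).mean(), 1))
print("mean comment words/issue:", round(comment_words.mean(), 1))
print("mean comments/issue:", round(comment_count.mean(), 1))
```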

6.1.1 Usability

Out of a total of 1144 issues, 24 were excluded from further analysis because they were duplicate or unusable. Table 6.3 summarizes the duplicate and unusable issues.

            Ashqary  Saadi
Total           728    416
Duplicate         7     16
Unusable          1      0

Table 6.3: The duplicate and unusable issues


6.2 Trends

In the previous section, understanding the issues, the results of inspecting individual issues were presented. This section presents the qualitative analysis performed on the inspected set of issues. These analyses show the general tendency (trends) of issues and their properties towards different categories such as integration, configuration and other characteristics like resolution time.

6.2.1 Severity and priority

Q2. “What types of bugs occur the most, categorized by their severity and priority attributes? ”

Figures 6.1 and 6.2 present the percentages of the issues grouped by these two attributes.

[Figure 6.1: The severity distribution in 720 issues of Ashqary (left: Cosmetic 7%, Troublesome 41%, Severe 38%, Blocking 14%) and 400 issues of Saadi (right: Cosmetic 11%, Troublesome 49%, Severe 31%, Blocking 9%).]

[Figure 6.2: The priority distribution in 720 issues of Ashqary (left: Low 11%, Medium 44%, High 38%, Top 7%) and 420 issues of Saadi (right: Low 15%, Medium 39%, High 41%, Top 5%).]

While figures 6.1 and 6.2 depict the severity and priority of all issues in general, tables 6.4 and 6.5 present the severity of the issues from a classification and issue origin (manifestation) point of view.

                           Cosmetic  Troublesome  Severe  Blocking
Bug                              24          209     218        87
Requirement                      14           40      20         7
Documentation                     8           48      19         7
Process                           1           13       3         2
Wrong test                        7           17      23         3
Untagged                         26          103      72        30
Application integration           9           63      96        34
Environment integration           0           25      38        21
Deployment                        4           37      24         6
Middleware                        1           15      33        13
Functional configuration          5           31      18         6
Application configuration         5           20      10         3
Incidental                        0            8      10         2
Performance                       0            2       4         2
System configuration              1           12       5         5
Test tool                         1           23       9         2
Hardware                          0            1       0         0

Table 6.4: Overview of issues of Ashqary categorized by severity, class and tag


                           Cosmetic  Troublesome  Severe  Blocking
Bug                              25          112      82        30
Requirement                      11           48      25         4
Documentation                     3           21       6         0
Process                           1           15       8         4
Wrong test                        4            8       3         0
Untagged                         21           62      23         8
Application integration          15           54      39        21
Environment integration           1           12      15         2
Deployment                        0           21      16         1
Middleware                        0           26      16         4
Functional configuration          2           15       5         0
Application configuration         1            5       8         0
Incidental                        1           11       4         2
Performance                       1            5      12         3
System configuration              0            5       1         0
Test tool                         0            0       0         0
Hardware                          0            0       0         1

Table 6.5: Overview of issues of Saadi categorized by severity, class and tag

6.2.2 Classification

H1. “The industrial bug data is less misclassified than the open-source bug data because of better process arrangement.”

Figures 6.3 and 6.4 depict the classification of the issues of Saadi and Ashqary. The classification shows that 70% of the issues of Ashqary and 61% of the issues of Saadi are bugs. Assuming that all of the post-release issues are reported as bugs, it is clear that a considerable number of reported bugs are actually not bugs.

[Figure 6.3: Classification of 720 issues of Ashqary: Bug 70%, Requirement 11%, Documentation 11%, Wrong test 6%, Process 2%.]

[Figure 6.4: Classification of 400 issues of Saadi: Bug 61%, Requirement 21%, Documentation 7%, Wrong test 4%, Process 7%.]

6.2.3 Manifestation

Q3. “How are the different categories of bugs related to aspects such as integration, configuration, process, documentation, and the different layers like operating system and middleware?”

Q3 tries to find out the origin and the context in which the issues are found and reported. These origins and contexts are defined as tags, reported under 5.1.3. As summarized in figures 6.5 and 6.6, about 228 issues of Ashqary and 114 issues of Saadi are untagged. This means that these issues could not be tagged with any of the given categories.

In both Ashqary and Saadi, issues tagged as application integration are the largest category of tagged issues (202 in Ashqary and 130 in Saadi), excluding the untagged issues (see 6.2.3.1). Environment integration, deployment and middleware issues also occur often.

Functional configuration issues remain one of the biggest configuration related categories in the rest of the issues.

[Figure 6.5: Origin and manifestation of 720 issues in Ashqary according to the defined tags (bar chart; the largest categories are untagged with 228 issues and application integration with 202 issues). Chart not reproduced.]

[Figure 6.6: Origin and manifestation of 400 issues in Saadi according to the defined tags (bar chart; the largest categories are application integration with 130 issues and untagged with 114 issues). Chart not reproduced.]

6.2.3.1 Refining the untagged

A further look into the untagged category, taken because of its high number of blocking and severe issues (see tables 6.4 and 6.5), did not reveal possible new tagging criteria (table 5.3) to refine these issues.

Based on their classification, refining the untagged issues reveals that only about half of them (51% in Ashqary and 54% in Saadi, see table 6.6) are bugs, while project-wide the bug percentages are 70% and 61%, see figures 6.3 and 6.4.

                 Ashqary  Saadi
Total untagged       228    114
Bug                  51%    54%
Documentation        24%    11%
Wrong test           16%     7%
Requirement          15%    25%
Process               1%     8%

Table 6.6: Classification of untagged issues

6.2.4 Analysis effort and duration

Q4. “How much analysis effort and resolution time is required for the different categories of bugs? ”

In the studied issues, the comment field is mainly used for analyzing the issue and as a communication and tracing tool between the different parties. This field turned out to be very useful during the research for understanding issues. The number of comments in this field can be used as an indication of how much analysis is required to resolve an issue. Figures 6.7 and 6.8 show the average number of comments per tag. In these graphs the frequency of each tagged issue is depicted in bold. Also, the average number of comments over all resolved issues (4,9 in Ashqary and 4,3 in Saadi) is depicted as a horizontal line.
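The averages in figures 6.7 and 6.8 can be computed as sketched below. This is a minimal sketch, assuming each resolved issue is available as a record with a 'tags' set and an 'n_comments' count; these field names are illustrative, not the study's actual tooling.

```python
# A minimal sketch of the analysis-effort indicator: average comments per tag
# versus the project-wide average over resolved issues (illustrative fields).
from collections import defaultdict

def comments_per_tag(resolved_issues):
    totals, counts = defaultdict(int), defaultdict(int)
    for issue in resolved_issues:
        for tag in issue["tags"]:
            totals[tag] += issue["n_comments"]
            counts[tag] += 1
    overall = sum(i["n_comments"] for i in resolved_issues) / len(resolved_issues)
    per_tag = {tag: totals[tag] / counts[tag] for tag in totals}
    return per_tag, overall
```

Because an issue can carry several tags, its comments count towards every tag it has, which is consistent with the per-tag frequencies reported in the figures.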

[Figure 6.7: The average comments per tag (y-axis: average comments per tag) against the project-wide average (4,9) of resolved Ashqary issues. The frequency per tag is printed in bold. Chart not reproduced.]

[Figure 6.8: The average comments per tag (y-axis: average comments per tag) against the project-wide average (4,3) of resolved Saadi issues. The frequency per tag is printed in bold. Chart not reproduced.]

To determine the resolution time of issues, only closed issues are used. Closed issues are either solved or rejected issues; in both cases the issue has been analyzed enough to make a decision about it. The issues that are still open are excluded because they are undecided. An issue's duration is the number of days between its insertion time and its closing time. In Ashqary 562 out of 728 issues were closed and in Saadi 349 out of 416 were closed.
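The duration computation itself is straightforward; the sketch below shows it under the assumption that each issue record carries an insertion date and a closing date (None while the issue is still open). The field names are illustrative.

```python
# A minimal sketch of the resolution-time computation over closed issues.
from datetime import date

def resolution_days(issue):
    if issue["closing_date"] is None:          # still open: excluded
        return None
    return (issue["closing_date"] - issue["insertion_date"]).days

issues = [
    {"insertion_date": date(2012, 3, 1), "closing_date": date(2012, 7, 12)},
    {"insertion_date": date(2013, 1, 9), "closing_date": None},
]
durations = [d for d in map(resolution_days, issues) if d is not None]
print(durations)                               # [133]
```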

Figures 6.9 and 6.10 depict the average resolution time per tag in days, pictured against the average resolution time of all issues.

[Figure 6.9: The average duration in days per tag against the average of all resolved Ashqary issues. The frequency of tags is printed in bold. Chart not reproduced.]

[Figure 6.10: The average duration in days per tag against the average of all resolved Saadi issues. The frequency of tags is printed in bold. Chart not reproduced.]

[Figure 6.11: Resolution time vs frequency of 562 issues of Ashqary (x-axis: solution time in days; y-axis: frequency of issues with the same solution time). Chart not reproduced.]

[Figure 6.12: Resolution time vs frequency of 349 issues of Saadi (x-axis: solution time in days; y-axis: frequency of issues with the same solution time). Chart not reproduced.]

Below are some graphs and analysis of the insertion and closure of issues over time. An important parameter is missing in figures 6.13 and 6.14, namely the test capacity while these issues were being found; this parameter is not registered during testing. The growth of issues based on their severity is also presented in these figures.

[Figure 6.13: Growth of Ashqary issues over time (cumulative count per month), with series for serious issues, troublesome issues, all issues and closed issues. Serious represents the severe and blocking issues, and Troublesome represents the cosmetic and troublesome issues. Chart not reproduced.]

In figure 6.13, between June 2012 and August 2012 fewer issues are reported, which can be explained by the summer vacation. The periods between September 2013 and November 2013, and from February 2014 until the end, show an increase in issues, which can be explained by the preparation for a pilot and the system release to production.

[Figure 6.14: Growth of Saadi issues over time (cumulative count per month), with series for serious issues, troublesome issues, all issues and closed issues. Serious represents the severe and blocking issues, and Troublesome represents the cosmetic and troublesome issues. Chart not reproduced.]

In the growth of Saadi issues in figure 6.14, it can be seen that almost no issues are reported between October 2011 and September 2012, due to decreased development activity and the absence of the system in the test environment. The growth in issues from January 2013 to March 2013 is because of the pilot preparation. Also, from July 2013 to September 2013 the project was being tested for the production release, which explains the increase in the insertion of issues. Saadi has been in production since September 2013.
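A minimal sketch of how the cumulative monthly counts behind figures 6.13 and 6.14 can be built is given below. It assumes each issue record has an 'insertion_date' (a datetime.date) and a 'severity' field; the grouping into serious and troublesome follows the figure captions, while the field and function names are illustrative.

```python
# A minimal sketch of cumulative issue growth per insertion month.
from collections import Counter
from itertools import accumulate
from datetime import date

def cumulative_per_month(issues, severities):
    """Cumulative number of issues per insertion month, limited to the given
    severity values (e.g. {"Severe", "Blocking"} for the 'serious' series)."""
    months = sorted({i["insertion_date"].strftime("%Y-%m") for i in issues})
    per_month = Counter(
        i["insertion_date"].strftime("%Y-%m")
        for i in issues
        if i["severity"] in severities
    )
    return dict(zip(months, accumulate(per_month.get(m, 0) for m in months)))

# Example with two hypothetical issues:
issues = [
    {"insertion_date": date(2011, 5, 3), "severity": "Severe"},
    {"insertion_date": date(2011, 7, 20), "severity": "Blocking"},
]
print(cumulative_per_month(issues, {"Severe", "Blocking"}))
# {'2011-05': 1, '2011-07': 2}
```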


Chapter 7

Discussion

This chapter answers and discusses the questions asked in chapter 2 and researched in the chapters that followed. At the end, this chapter contains a discussion about some other issue related findings.

The numbers and percentages are based on 720 issues of Ashqary and 400 issues of Saadi.

7.1 Severity and priority

Q2. “What types of bugs occur the most, categorized by their severity and priority attributes? ”

Assuming that the recorded priority and severity of the issues are correct, the majority of the issues in both systems are troublesome (41% in Ashqary and 49% in Saadi), followed by severe issues (38% in Ashqary and 31% in Saadi), as depicted in figure 6.1. 52% of the issues of Ashqary and 40% of the issues of Saadi are either blocking or severe, which can have a great impact on the stability and testability of these systems. Also, almost 50% of the issues have top or high priority, which indicates that the issues either cause system failure or cause serious errors (figure 6.2).

The most frequently occurring severe and blocking issues are application integration issues (130 in Ashqary and 60 in Saadi), see tables 6.4 and 6.5. Untagged issues are the next most frequent category, followed by environment integration, middleware, deployment and the other categories of issues mentioned in the same tables.


7.2 Classification

H1. “The industrial bug data is less misclassified than the open-source bug data because of better process arrangement.”

Herzig et al. [7] report that 33.8% of the issues of 5 open-source projects that are reported as bugs are actually not bugs. As can be seen in figures 6.3 and 6.4, 30% of the Ashqary issues and 39% of the Saadi issues are not classified as bugs. These issues are instead classified as requirement, documentation, process or wrong-test related issues. Assuming that these post-release issues are all reported as bugs, the above hypothesis that the industrial bug data is less misclassified than the open-source bug data does not hold in the scope of these two industrial systems.

The bug misclassification in these two systems is possibly influenced by the fact that these issues are mainly reported by system testers. Besides reporting problems, these reports also reflect the wishes of the testers. The testers are not involved during the requirements analysis of these systems, but only in a later phase of delivering these systems, see 3.4. Also, the documentation of these systems, which should meet the same quality level as the software itself, can explain the high number of documentation issues.

7.3 Manifestation

Q3. “How are the different categories of bugs related to aspects such as integration, configuration, process, documentation, and the different layers like operating system and middleware?”

As depicted in figures 6.5 and 6.6, a quarter of the issues are related to application integration. Application integration issues are detected in the presence of different components of a system or the system as a whole (table 5.3). These issues are most probably not found during development because the required components were not present and not integrated before the system release. The components of Ashqary and Saadi are distributed over physically different machines, including legacy environments, see 3.1 and 3.2. In most cases, application integration is a collaboration between different parties, which can influence the number of integration issues.

Environment integration, deployment, and middleware issues form another quarter of the issues. These issues find their roots in the context in which a sub-system should run. It is not researched how much the lack of a right environment plays a role in not detecting these issues before a system is released. It is also not researched how much focus lies on environment integration before the system release. As mentioned in 3.4, the operating system and related configuration of the Linux environment are not representative, which might contribute to these issues not being seen before release.

About a quarter of the issues are mainly configuration, test-tooling and performance related. Functional configuration issues remain the largest category among the configuration related issues.

Also, 228 issues of Ashqary and 114 issues of Saadi could not be tagged according to the criteria in table 5.3. Refining the untagged issues reveals that only half of them are bugs, table 6.6. This may indicate that just half of the untagged bugs could possibly have been detected locally, without the need for the right environment, the right components or the right data. The other half of the untagged issues are spread over documentation, wrong testing, requirements and following the process.

7.4 Analysis effort and duration

Q4. “How much analysis effort and resolution time is required for the different categories of bugs? ”

Assuming that the comments are mainly used to analyze an issue, the number of comments can be used to get an impression of how much analysis effort is required until an issue is resolved. This assumption is based on the experience during the research, which showed that reading the comments is a rich source of information to understand an issue. These comments contain facts (based on logs, for example), proposals and ideas about the origin of the issue, solutions and work-arounds, rationales, or simply a status update. On average, Ashqary and Saadi issues had respectively 4.9 and 4.3 comments per issue, see figures 6.7 and 6.8. Performance, application integration, environment integration, test tooling and application configuration related issues contain more comments on average than other issues. For both Ashqary and Saadi, application integration related issues occur more often and have more comments than other issues.
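To illustrate how such a comment-based effort proxy can be computed, the sketch below aggregates the average comment count per issue category. It is a minimal example only; the CSV export and the column names (`category`, `comment_count`) are hypothetical and not part of the actual HPQC data set used in this study.

```python
import csv
from collections import defaultdict

def average_comments_per_category(csv_path):
    """Compute the average number of comments per issue category.

    Assumes a CSV export with hypothetical columns 'category' and
    'comment_count'; real issue-tracker exports will differ.
    """
    totals = defaultdict(int)   # sum of comment counts per category
    counts = defaultdict(int)   # number of issues per category

    with open(csv_path, newline='', encoding='utf-8') as f:
        for row in csv.DictReader(f):
            category = row['category'] or 'untagged'
            totals[category] += int(row['comment_count'])
            counts[category] += 1

    return {cat: totals[cat] / counts[cat] for cat in totals}

if __name__ == '__main__':
    averages = average_comments_per_category('issues.csv')
    for cat, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
        print(f'{cat}: {avg:.1f} comments per issue on average')
```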

Figures 6.9 and 6.10 depict that the average duration of the issues of Ashqary and Saadi is respectively 121 and 125 days, and the durations are also concentrated around these numbers (figures 6.11 and 6.12). The reason for such ‘long’ average resolution times is possibly the distance between the development team and the test team, and the separate planning schedules that these two teams use to solve and retest the issues, 3.4. Also, issues that get a lower priority may not get enough attention to be resolved sooner, resulting in longer resolution times. Furthermore, possible conflicts between the different parties can lead to delays in solving an issue. Additionally, the access to the issue tracking system and the rights for closing an issue can result in longer resolution times. Yet another reason might be the complexity of analyzing these issues.

Comparing figure 6.9 with 6.7 and figure 6.10 with 6.8 may indicate that most of the categories that need more analysis effort on average also take longer to resolve. For example, in both Saadi and Ashqary, application integration issues have higher comment counts and take longer to resolve.

7.5 The possible root causes

Q5. “Based on the research analysis, what are the visible shortcomings of the processes, testing methodologies and techniques at TSE? ”

The analysis of the issues discussed until now and the information on delivering Ashqary (3.4) suggest that there might be two causes for not detecting the post-release issues earlier: first, the lack of a representative environment during development and maintenance; second, the late integration of the system as a whole.

7.5.1 Lack of representative environment

Application integration, environment integration, deployment, and middleware issues suggest that the absence of the system as a whole and the absence of a representative, production-like environment (section 3.4) play a role in not detecting these issues in time, before a system is released at TSE. This is confirmed by Tassey [5], who researched the economic impact of an inadequate testing infrastructure. In their study of the automotive industry, the developers responded that the majority of the issues could have been avoided if a production-like environment had been present.

The findings also suggest that integration and environment related issues are more difficult to resolve, 6.2.4. In such cases more time and analysis effort is required, which might be caused by the complexity of the issues and of these types of systems. It is also noteworthy that between 40% and 55% of the environment related issues have a blocking or serious impact, see tables 6.4 and 6.5.

The complexity of the system can be explained by the modularization of the system into smaller sub-systems, which are distributed over different geographical locations, run on different operating systems, and use different technologies, figures 3.1 and 3.2. Related to this, Pecchia et al. [20] observe that the cost of integrating a system as a whole that consists of off-the-shelf modular components and services can be even higher than building the system from scratch.

The lack of a complete, representative system may also mean that the development team does not get the opportunity to learn and understand the domain rationales and the system as a whole.

7.5.2 Late integration

As mentioned earlier, the system as a whole is not present, which implies that the system as a whole is not integrated and tested during development and maintenance at TSE. This integration is executed in the customer’s testing environment after the system is released. Such an arrangement provides the opportunity for issues to remain unseen during the development stage.

Also, there is little integration between the development team at TSE and the testing team. The development and test teams are at different locations, they do not work synchronously, the issue tracking system is used for communication, and there is no continuous, direct face-to-face communication. As a result, the feedback between the testing team, where the issues are found, and the development team, where the issues should be analyzed and resolved, is delayed. In a complex system and environment it is easier to identify the cause of an introduced issue if the issue is seen immediately. Immediate communication and reporting of an issue has the benefit that the knowledge of the development team about the introduced issue is still fresh at that moment. This stimulates shorter resolution times, fewer unusable issues and earlier detection of issues. If an issue is found later, it will probably need more analysis and time to solve, as explained earlier.

7.6 Early detection

As mentioned in chapter 2, the main focus of this study is to understand the post-release issues and propose an improvement.

Q1. “Based on the analysis of industry bug data, what category of bugs can be detected more efficiently before a software system is released at TSE? ”


Application integration, environment integration, deployment, middleware, and system configuration issues form more than half of the issues in both Ashqary and Saadi (figures 6.5 and 6.6). As mentioned in 5.3, these issues are found in the presence of multiple components of the system or the system as a whole, and they have a direct relation with the environment and context in which the system runs. The previous section also explained that the lack of a representative environment and the late integration of the system as a whole (after release) are possibly the two main root causes. So, the most advantage might be gained by having a representative environment during development and by integrating earlier and more often. Integrating earlier implies that the test scenarios and knowledge that are currently used after release become essential during development.

The presence of the system as a whole and of the right, production-like environment raises a financial discussion, which is out of the scope of this study. However, making more use of the customer’s test environment by the development team at TSE might be a possible temporary solution. First, the system as a whole is present there, which makes integrating the system as a whole possible. Second, the development and test teams will have more direct communication, which might result in less effort and time to analyze and resolve the issues.

The functional configuration issues, which form the biggest configuration and data related category of issues, might be avoided. Most of the time these issues are caused by inconsistent use of configuration files across the system as a whole. For example, while it is expected that all the components use version 2 of a configuration, some of them might still use version 1, which eventually causes system misbehavior.
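A simple automated consistency check could catch this class of misconfiguration before release. The sketch below compares the configuration version used by each component; the component names, file paths and the `version` key are purely hypothetical and only illustrate the idea.

```python
import json
from pathlib import Path

# Hypothetical locations of the deployed configuration per component.
COMPONENT_CONFIGS = {
    'frontend': Path('/opt/system/frontend/config.json'),
    'backend': Path('/opt/system/backend/config.json'),
    'legacy-bridge': Path('/opt/system/legacy/config.json'),
}

def check_config_versions(configs):
    """Report the configuration version per component and flag mismatches."""
    versions = {}
    for component, path in configs.items():
        with path.open(encoding='utf-8') as f:
            versions[component] = json.load(f).get('version', 'unknown')

    if len(set(versions.values())) > 1:
        print('Configuration version mismatch detected:')
        for component, version in versions.items():
            print(f'  {component}: version {version}')
    else:
        print('All components use the same configuration version.')
    return versions

if __name__ == '__main__':
    check_config_versions(COMPONENT_CONFIGS)
```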

7.7 The issue with issues

As mentioned in 5.3, HPQC and perhaps many other issue tracking systems mainly focus on the management meta-data of the issues. From a problem analysis point of view, these issue tracking systems offer little help. The elements that are important for analyzing an issue and locating the problem, namely the context in which the issue is observed, the steps taken to (re)produce it, and the possible inputs and outputs, are simply left to the reporter. The reporter may forget these important parameters, or provide incomplete information.

Also, among the issue reporters there is no widely accepted, effective format and way of reporting issues; everyone reports in their own way. This aspect also adds to the difficulty of analyzing an issue. One of the ways to report an issue is by using the format proposed by Zeller [1].
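Such a format can also be enforced mechanically. The sketch below checks whether a report contains the elements commonly recommended for reproducible bug reports (summary, environment, steps to reproduce, expected and observed behavior); it is a general illustration under those assumptions, not a reproduction of Zeller’s exact template or of any tracker’s API.

```python
# Hypothetical minimal check that an issue report contains the elements
# commonly recommended for reproducible bug reports.
REQUIRED_FIELDS = [
    'summary',
    'environment',          # e.g. operating system, middleware versions
    'steps_to_reproduce',
    'expected_behavior',
    'observed_behavior',
]

def missing_fields(report: dict) -> list:
    """Return the required fields that are absent or empty in a report."""
    return [field for field in REQUIRED_FIELDS if not report.get(field)]

if __name__ == '__main__':
    example_report = {
        'summary': 'Deployment fails on the integration environment',
        'environment': 'Linux, middleware X 2.1',   # hypothetical values
        'steps_to_reproduce': '',                   # left empty by the reporter
    }
    gaps = missing_fields(example_report)
    if gaps:
        print('Report is incomplete, missing:', ', '.join(gaps))
    else:
        print('Report contains all required fields.')
```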
