Mutation Testing in Functional Languages


Master’s Thesis 2016

Joël James Bartholomew

Graduate School of Informatics Master Software Engineering University of Amsterdam Amsterdam, The Netherlands 2016


submitted in fulfillment of the requirements for the degree of

Master Software Engineering

JOËL JAMES BARTHOLOMEW

Faculty of Science


Supervisor: Prof.dr. D.J.N. van Eijck

Host Organization: Centrum voor Wiskunde en Informatica (CWI)


Abstract

When it comes to software, developers must take measures to ensure that their code is as fault-free as realistically possible to have their software function as intended. One of the ways of achieving this is by making use of tests, specifically automated tests. Automated tests provide programmers with a degree of confidence when it comes to the quality of the systems they are developing. However, these tests are only of value if they are of a high enough quality. One way of establishing the quality of tests is by making use of Mutation Testing. Mutation Testing works by attempting to determine how strict a test-suite is by running it against programs which contain known faults. This thesis provides insight into the application of Mutation Testing Analysis for functional languages such as Haskell. A number of mutation operators are also proposed, which are types of faults that can be inserted into programs written in functional languages for the purpose of Mutation Testing.


Acknowledgements

I would like to thank my supervisor, Jan van Eijck, for supporting me during the writing of this thesis. I received a great deal of advice regarding research methods and the writing of the thesis itself.


Abstract
Acknowledgements
Introduction
    Research questions
    Research method
    Summary of chapters
1 Related work
2 Background
    2.1 Introduction
    2.2 Functional programming
        2.2.1 Haskell
        2.2.2 Haskell Test-frameworks
            2.2.2.1 HUnit
            2.2.2.2 QuickCheck
            2.2.2.3 Hspec
    2.3 Establishing the quality of test-suites
        2.3.1 Test Coverage
        2.3.2 Mutation Testing
    2.4 Application of Mutation Testing to functional programming
3 Mutation Operators
    3.1 Introduction
    3.2 Mutation Operators supported by MuCheck
        3.2.1 NIF: The negation of if-statements
        3.2.2 NPG: The negation of pattern guards
        3.2.3 RDPM: Reordering for pattern matching and deletion of patterns
        3.2.4 TFR: Type-aware function replacement
        3.2.5 MPV: Mutating primitive literal values such as Integer, String, and Bool
    3.3 Extensions to MuCheck
        3.3.1 UEDDN: Unit expression deletion in code written using do-notation
        3.3.2 DEDDN: Deleting expressions in do-notation whose result is discarded
        3.3.3 RLV: The reordering of elements in list literals
        3.3.4 DLV: Deleting elements in list literals
        3.3.5 FLLV: Replacing list literals with their first and last elements
        3.3.6 IOF: The introduction of new function bindings that are misspelled versions of existing functions
        3.3.7 RCE: Reordering of case expressions
        3.3.8 DCE: Deletion of case patterns
        3.3.9 MBO: Mutation of binary boolean operations
        3.3.10 MNA: Mutating not applications
    3.4 Conclusion
4 MuCheck Technical Details
    4.1 Introduction
    4.2 Mutant generation
    4.3 Running tests
    4.4 Implementing Mutation Operators
    4.5 Conclusion
5 Case Studies
    5.1 Introduction
    5.2 Triangles
        5.2.1 Description
        5.2.2 Analysis
        5.2.3 Mutant Type Distribution
    5.3 The Knight’s Tour
        5.3.1 Description
        5.3.2 Analysis
    5.4 IBAN validation
        5.4.2 Analysis
    5.5 Credit card validation
        5.5.2 Analysis
    5.6 Domain Specific Language Interpreter
        5.6.1 Description
        5.6.2 Mutant Distribution
    5.7 RSA
        5.7.1 Description
        5.7.2 Mutant Distribution
    5.8 Conclusion
6 Discussion
    6.1 MuCheck evaluation
    6.2 Conclusion
    6.3 Future work
Appendix: Case Study Programs
    Triangles
    Alternative Triangles
    Knight’s Tour
    IBAN validation
    Credit card number validation
    Interpreter for Domain Specific Language
    Nofib RSA algorithm implementation


Introduction

Mutation testing is a fault-based testing technique used to evaluate the quality of a test-suite. A number of mutations are introduced into a program and the accompanying test-suite is run with the intention of killing the generated mutants, meaning that part of the test-suite fails due to their introduction (Jia & Harman, 2011). A lot of research has been performed on the Mutation Testing of imperative languages, but there is currently a lack of research into the application of the technique to functional programming languages. This in turn leads to a lack of mutation operators that are tailored to functional programming languages. Existing mutation operators for imperative languages do not directly translate to functional languages. To apply them to functional languages it is necessary to think differently about their underlying concepts. For example, deleting a statement in an imperative language is essentially the elimination of computation, so a similar result can be achieved in a functional language by replacing a function of type a → a with the identity function. Some mutation operators for languages such as C apply to monadic programming due to its similarity to the imperative world. This leads us to another example, which is the deletion of expressions from code written in do-notation.
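The analogy between statement deletion and replacement by the identity function can be illustrated with a minimal sketch. The function normalize and the test testNormalize are hypothetical names chosen for this illustration; they are not taken from the case studies.

```haskell
import Data.Char (toLower)

-- Hypothetical original: lower-cases its input (type a -> a with a = String).
normalize :: String -> String
normalize = map toLower

-- Mutant: the computation is eliminated by substituting the identity function.
-- The program still type-checks, just as an imperative program still compiles
-- after a statement has been deleted.
normalizeMutant :: String -> String
normalizeMutant = id

-- A test that kills the mutant: it passes for the original and fails for the mutant.
testNormalize :: (String -> String) -> Bool
testNormalize f = f "ABC" == "abc"
```

A test such as `\f -> f "abc" == "abc"` would not kill this mutant, because the original and the mutant agree on input that is already lower-case.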

Research questions

This thesis presents research into suitable mutation operators for functional programs and attempts to answer the following questions:

1. What can one learn from using Mutation Testing for functional programming?
2. What mutation operators would be suitable for functional programming languages?
3. How does MuCheck (Le, Alipour, Gopinath, & Groce, 2014) perform when it comes to analyzing Haskell programs?
4. What are mutation operators that MuCheck can be extended with?
5. How does the extended MuCheck perform?


Research method

The first part of the research is a series of case studies providing insight into the usage of Mutation Testing for functional programming and the performance of Mutation Operators. This is performed using a tool known as MuCheck. MuCheck is a framework for the Mutation Testing of Haskell programs, originally published in a conference paper in 2014 (Le et al., 2014). A more in-depth explanation of MuCheck is provided in Chapter 4.

The case studies concern the following Haskell programs:

1. A program which determines the type of a triangle
2. A program which solves the Knight’s tour chess problem
3. A program which validates IBANs
4. A program which validates credit card numbers
5. An interpreter for a domain specific language
6. An implementation of the RSA algorithm taken from the nofib project¹

The programs make use of the HUnit and QuickCheck testing frameworks.

The first program was chosen because it is often used for case studies in Mutation Testing literature. The second is an interesting puzzle that contains a number of logical constraints. The third and fourth programs represent more realistic applications but are still short enough that they can be reasoned about. The fifth program was chosen because functional languages are often used for writing interpreters. The last program was chosen because it is part of the real category of the nofib project and could be massaged into a format that can be analyzed using MuCheck.

The first four studies consist of analyzing the initial specifications of the programs and then using mutation analysis to improve the specification. The remaining studies are used to get additional data that can be used to analyze the proposed extensions to MuCheck.

A Haskell program was written for the analysis which uses MuCheck as a library and allows simple configuration of the mutation operators used. MuCheck can use program coverage information to limit the number of mutants generated but coverage information was not used during the case study because the entire programs are being tested.

¹ The nofib repository contains a number of programs which are used to benchmark the Glasgow Haskell Compiler (GHC).

The second part of this thesis attempts to apply mutation operators to the programs to determine whether or not they are beneficial for use in the Mutation Testing of functional languages. MuCheck is extended with the mutation operators, which are transformations performed on the abstract syntax tree of the program being analyzed. The previously mentioned programs are analyzed with the extended MuCheck and the number of generated mutants is compared with the number generated during the case studies.

Summary of chapters

This is a brief outline of what went into each chapter:

• Chapter 1 discusses work that is related to this thesis.

• Chapter 2 gives a background on mutation testing and functional programming languages.

• Chapter 3 discusses the existing Mutation Operators in MuCheck and a set of proposed operators for functional languages.

• Chapter 4 discusses the technical details of MuCheck and the implementation of the proposed mutation operators.

1 Related work

In this section, we present recent work that is related to mutation testing in functional programming languages and mutation testing in general. The first four publications are similar to this thesis because they examine the application of Mutation Testing to fields other than imperative programming.

Gopinath et al. present a Haskell tool known as MuCheck for mutation testing (Le et al., 2014). This thesis uses MuCheck as an analysis tool.

Taylor and Derrick present a refactoring-based mutation testing framework for Erlang (Taylor & Derrick, 2015).

Granda et al. elaborate on a design for mutation operators for schemas based on the Unified Modeling Language (UML) (Granda, Condori-Fernández, Vos, & Pastor, 2016).

Deng et al. present mutation operators for Android software development (Deng, Offutt, Ammann, & Mirzaei, 2016).

Jia and Harman have compiled a survey of all publications related to the topic of Mutation Testing since 1970 (Jia & Harman, 2011).

2 Background

2.1 Introduction

In this chapter, we provide the information related to functional programming and Mutation Testing that is necessary for the reader to understand this thesis.

2.2 Functional programming

Functional programming is a paradigm that attempts to avoid the changing of state and mutable data. Some of the key motivations behind functional programming are (Hughes, 1989):

• It becomes easier to reason about functions as they always return the same result given the same arguments.

• Modularity is achieved by way of higher-order functions and function composition.

Some languages are considered to be purely functional, but over time popular languages such as Java have gained constructs which allow one to program in a more functional manner.

2.2.1 Haskell

Haskell is a general-purpose purely functional programming language. The language is named after the logician, Haskell Curry. Haskell uses lazy evaluation by default and has a strong type system which can be used to model a variety of problem domains.

2.2.2 Haskell Test-frameworks

The use of test frameworks may also have an impact on the quality of tests. There are a number of test-frameworks for Haskell. Three of the more popular ones are HUnit, QuickCheck and Hspec.

2.2.2.1 HUnit

HUnit is a unit-testing framework for Haskell based on JUnit (Herington, 2010). HUnit testing works based on the concept of expectations. The programmer specifies a number of steps for each test scenario and verifies that the code is working as intended by comparing output data to expectations.

2.2.2.2 QuickCheck

QuickCheck is a library for automated property-based testing (Claessen & Hughes, 2011). While example-based testing methods make use of one test scenario and its expected results, property-based testing makes use of properties and automatically generated input data. A single property-based test case is run multiple times with random input data, and the specified property must hold for each run. If the property does not hold, QuickCheck attempts to reduce the failing input to the smallest possible counter-example and displays that. This process is referred to as shrinking.

2.2.2.3 Hspec

Another popular Haskell testing framework is Hspec (Hengel, 2011). Hspec is a behavior-driven-development framework, inspired by Ruby’s RSpec, that can make use of QuickCheck properties. Its expectation language is based on HUnit. Because of its similarity to HUnit and QuickCheck, no programs that make use of this framework were analyzed.

2.3 Establishing the quality of test-suites

To ensure that software functions as intended, it is necessary to make use of automated tests. However, the usage of the tests only has value if the specification itself is of a high quality. The Latin phrase “Quis custodiet ipsos custodes?”, meaning “Who will guard the guards?”, is an accurate description of this dilemma. Within the field of Software Testing this is referred to as “The Oracle Problem” (Barr, Harman, McMinn, Shahbaz, & Yoo, 2015). At the moment, the most popular technique for determining the quality of test-suites is the Test Coverage metric, but there has also been research into another method called Mutation Testing.

2.3.1 Test Coverage

The following is a description of Test Coverage metric variants (Zhu, Hall, & May, 1997):

Statement Coverage. In software testing practice, testers are often required to generate test cases to execute every statement in the program at least once. A test case is an input on which the program under test is executed during testing. A test set is a set of test cases for testing a program. The requirement of executing all the statements in the program under test is an adequacy criterion. A test set that satisfies this requirement is considered to be adequate according to the statement coverage criterion. Sometimes the percentage of executed statements is calculated to indicate how adequately the testing has been performed. The percentage of the statements exercised by testing is a measurement of the adequacy.

Branch Coverage. Similarly, the branch coverage criterion requires that all control transfers in the program under test are exercised during testing. The percentage of the control transfers executed during testing is a measurement of test adequacy.

Path Coverage. The path coverage criterion requires that all the execution paths

2.3.2 Mutation Testing

The origin of Mutation Testing is a student paper published by Richard Lipton in 1971 (R. Lipton, 1971). Mutation Testing is a fault-based testing technique which involves creating artificial input data for the test-suite, in the form of faulty versions of the program, and then running the original program’s test-suite against them. These faulty versions are referred to as mutants and are created by injecting faults into the original program. The quality of the test-suite is then determined by looking at the number of mutants that cause the tests to fail and the number of mutants that manage to pass all the tests. The result is the “Mutation Adequacy Score”, which can be used to represent the quality of a test-suite (Jia & Harman, 2011).

Mutation Testing has limitations with regard to how computationally expensive it can be, and the process can sometimes generate mutants which are semantically equivalent to the original program (Jia & Harman, 2011). Additionally, Mutation Testing requires taking Test Coverage into account to minimize the generation of equivalent mutants due to mutating code that is not tested by the specification. While a number of researchers have looked into methods to deal with the detection of equivalent mutants, such as the use of mathematical constraints (A. J. Offutt & Pan, 1996), it is currently not possible to automatically and accurately detect equivalence between two arbitrary programs because it is an undecidable problem (Madeyski, Orzeszyna, Torkar, & Jozala, 2014). This means that extremely stubborn mutants must be checked by a human to determine if they are truly equivalent to the original program. Equivalent mutants are subtracted from the total number of generated mutants during the calculation of the Mutation Adequacy Score.
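The calculation described above can be sketched as follows. This is a simplified model under stated assumptions, not MuCheck's implementation: the function isOne, the hand-written mutants, and the representation of a test as a predicate over the function under test are all hypothetical.

```haskell
-- Hypothetical function under test (not used by the code below, shown for reference).
isOne :: Int -> Bool
isOne x = x == 1

-- Three hand-written mutants of isOne.
mutants :: [Int -> Bool]
mutants =
  [ \x -> x /= 1      -- negated comparison
  , \x -> x == 2      -- mutated literal
  , \x -> x == 1 + 0  -- equivalent mutant: behaves exactly like the original
  ]

-- A test is modelled as a predicate over the function under test.
type Test = (Int -> Bool) -> Bool

suite :: [Test]
suite = [ \f -> f 1, \f -> not (f 0) ]

-- A mutant is killed when at least one test fails on it.
killed :: [Test] -> (Int -> Bool) -> Bool
killed tests m = not (all ($ m) tests)

-- Mutation Adequacy Score: killed mutants divided by the non-equivalent mutants.
-- The caller must ensure that at least one non-equivalent mutant exists.
mutationScore :: Int -> Int -> Int -> Double
mutationScore nKilled nTotal nEquivalent =
  fromIntegral nKilled / fromIntegral (nTotal - nEquivalent)
```

In this sketch the third mutant survives every possible test, which is why it has to be identified by hand and subtracted from the total before the score is computed.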

Mutation Testing has two variants: Weak and Strong (Woodward & Halewood, 1988). Weak Mutation Testing checks to see if the output of one statement is changed due to mutation. Strong Mutation Testing checks to see if a mutation becomes visible in the output of the program.

Because it is theoretically possible to produce an infinite number of mutants from any given program, Mutation Testing has two fundamental hypotheses:

The Competent Programmer Hypothesis (DeMillo, Lipton, & Sayward, 1978)

The theory states that programmers are competent and any faults in a piece of software are often just small variations of the correct program. This leads to the generation of mutants that are only small syntactic variations of the original program. These variations tend to be limited to a small number of token differences.


The Coupling Effect (DeMillo et al., 1978) (A. Offutt, 1989) (A. J. Offutt, 1992)

The Coupling Effect deals with two classes of mutants:

1. Simple mutants, often referred to as first-order mutants, are small syntactic variations in a program.

2. Complex mutants, also known as higher-order mutants, are composed of multiple simple mutants.

The coupling effect states that “Complex mutants are coupled to simple mutants in such a way that a test data set that detects all simple mutants in a program will also detect a large percentage of the complex mutants”. This is the reason why the majority of Mutation Testing applications only make use of simple mutants.

Of the two hypotheses, only the Coupling Effect has been somewhat investigated empirically (A. J. Offutt, 1992). Evidence that supports the Competent Programmer Hypothesis has yet to appear. A study by Gopinath et al. attempts to determine how closely Mutation Operators correlate with real faults; their results show that bug fixes have an average change size of three to five tokens (Gopinath, Jensen, Groce, & others, 2014). The study also showed that the patterns of faults differ across languages, which implies that Mutation Operators are often specific to programming languages. However, they did see a small relationship between their data for faults in Python and Haskell. This is likely due to Python having a number of facilities for functional programming (Gopinath et al., 2014).

A number of tools have been developed for the purpose of Mutation Testing, which are more often than not specialized to one programming language:

• Milu (Jia & Harman, 2008). This is a Mutation Testing framework for programs written in the C language. It is notable because, in contrast to other Mutation Testing tools, it focuses on the generation of higher-order mutants.

• MuJava (Ma, Offutt, & Kwon, 2005). A class-mutation testing framework for the Java programming language. The focus of MuJava is on object-oriented mutation operators.

• Mothra (King & Offutt, 1991). One of the earlier Mutation Testing tools, written for the Fortran 77 language.

• Major (Just, 2014). A Mutation Testing framework for Java. It is integrated into the OpenJDK Java compiler.

2.4 Application of Mutation Testing to functional programming

As said earlier, the majority of Mutation Testing research focuses on constructs from imperative languages. However, functional programming languages have recently become more popular, to the point that imperative languages have started extending themselves with functional constructs. An example would be the introduction of lambda expressions in Java 8. These trends indicate that there needs to be research performed on the application of Mutation Testing to functional languages. MuCheck was developed by Gopinath et al. as a starting point for research into functional languages and Mutation Testing (Le et al., 2014). Haskell is targeted due to it having a mature set of test frameworks and its status as a “laboratory” for functional programming (Le et al., 2014). Another reason that Haskell is a reasonable starting point is that it has a strong type system that catches certain classes of faults, and because of this much of the knowledge of Mutation Testing in Haskell can be applied to other functional languages which do not have as strong a type system.

3 Mutation Operators

3.1 Introduction

Mutation Operators are transformation rules that are used to modify the original program to produce mutants. There are a number of mutation operators for imperative languages. An example, shown in Listing 3.1, is the mutation of arithmetic operators. The example is based on the notion that a programmer can accidentally use the wrong operator. In this particular case, the addition operator is replaced with the subtraction operator.

    next_line = next_line + 1;
    /* becomes */
    next_line = next_line - 1;

Listing 3.1: C: Mutating arithmetic operators in the C programming language

The majority of research pertaining to Mutation Operators has been performed on their usage in analyzing procedural code. There has also been research performed into Object-oriented Mutation Operators but that field is not as mature.

3.2 Mutation Operators supported by MuCheck

In this section, we discuss MuCheck and the types of Mutation Operators it supported at the start of this thesis. The publication which introduces MuCheck mentions the following operators: reordering of pattern matching, mutations of lists and list expressions, and type-aware function replacement (Le et al., 2014).

3.2.1 NIF: The negation of if-statements

This mutation operator is meant to mimic how a programmer can make an error in specifying logic and invert the conditions for expressions. The transformation is performed by swapping the then-expression with the else-expression.

    isOne :: Int -> Bool
    isOne x = if x == 1 then True else False

    -- becomes

    isOne :: Int -> Bool
    isOne x = if x == 1 then False else True

Listing 3.2: Haskell: Negation of if-expressions

3.2.2 NPG: The negation of pattern guards

Similar to the previous operator, this transformation negates clauses to mimic mis-takes made in specifying logic.

    f x | x == 0    = True
        | otherwise = False

    -- becomes

    f x | not (x == 0) = True
        | otherwise    = False

Listing 3.3: Haskell: Negation of pattern guards

3.2.3 RDPM: Reordering for pattern matching and deletion of patterns

In Haskell, the order of function pattern bindings can be used to indicate precedence. This means that a programmer can introduce a fault when determining the order of patterns. This mutation operator changes the order of function pattern declarations.

A negative aspect of this operator’s current implementation in MuCheck is that it generates a mutant for every permutation. This leads to a high number of mutants, depending on the number of bindings of a function.

    take 0 _ = []
    take _ [] = []
    take n (x:xs) = x : take (n-1) xs

    -- becomes

    take' _ [] = []
    take' 0 _ = []
    take' n (x:xs) = x : take' (n-1) xs

Listing 3.4: Haskell: Reordering of pattern matching

3.2.4 TFR: Type-aware function replacement

This is a mutation operator that allows the user of the Mutation Testing tool to specify functions that can be swapped. The most common usage is the swapping of logical and arithmetic operators. This is because in Haskell operators are simply functions that can be applied using infix notation.

    isZeroOrOne :: Int -> Bool
    isZeroOrOne x = if (x == 0) || (x == 1) then True else False

    -- becomes

    isZeroOrOne :: Int -> Bool
    isZeroOrOne x = if (x == 0) && (x == 1) then True else False

Listing 3.5: Haskell: Function replacement

3.2.5 MPV: Mutating primitive literal values such as Integer, String, and Bool

The transformation rule is intended to mimic typing errors that are made by programmers when writing literal values.

    oneToTen :: [Integer]
    oneToTen = [1..10]

    -- becomes

    oneToTen :: [Integer]
    oneToTen = [2..10]

Listing 3.6: Haskell: Mutating literals

There are some limitations to the above-mentioned operators. For example, there are no operators that specifically target code written using do-notation. Another point is that negating if-statements and pattern guards does not cover all logical errors. There should be more mutation operators to provide more coverage for different cases of logical errors.

3.3 Extensions to MuCheck

To supplement the operators MuCheck already supports, a number of mutation operators are proposed. For ease of reference they each have an associated abbreviation. They are grouped into the following categories:

Monadic code. These are mutation operators for code that makes use of monads; more specifically, they mutate code written using do-notation. Do-notation code has similarities to imperative programming.

• UEDDN: Unit expression deletion in code written using do-notation
• DEDDN: Deleting expressions in do-notation whose result is discarded

Lists. The mutation of list literals is described in MuCheck’s publication (Le et al., 2014); however, the tool itself did not have any operators implemented to perform the mutations.

• RLV: The reordering of elements in list literals
• DLV: Deleting elements in list literals
• FLLV: Replacing list literals with their first and last elements

Boolean logic. These operators modify boolean functions to mimic faults in logic.

• MBO: Mutation of binary boolean operations
• MNA: Mutating not applications

Pattern bindings. This category contains operators that have an effect on pattern matching in a similar way to the ones proposed by the authors of MuCheck and are generally applicable to all functional programs.

• IOF: The introduction of a new function binding that is a misspelled version of another function
• RCE: The reordering of case expression patterns
• DCE: The deletion of case expression patterns

3.3.1 UEDDN: Unit expression deletion in code written using do-notation

In code written in do-notation it is possible to have expressions that do not return a meaningful result (i.e. (), the unit value). This means that the expressions can be deleted without causing compilation errors. This type of fault is reminiscent of statement deletion in imperative languages and is not caught by the Haskell compiler.

    doIO :: IO ()
    doIO = do
      print "Hello"
      print "World"
      return ()

    -- becomes

    doIO :: IO ()
    doIO = do
      print "Hello"
      return ()

Listing 3.7: Haskell: Deletion of expressions in do-notation

3.3.2 DEDDN: Deleting expressions in do-notation whose result is discarded

In Haskell, it is possible to discard the result of an expression in do-notation by binding its result to the wildcard pattern, “_”. Doing this implies that none of the expressions below depend on the result of the computation, so the expression can be deleted without introducing compiler errors.

    doIO :: IO ()
    doIO = do
      _ <- getLine
      print "Hello, World"
      return ()

    -- becomes

    doIO :: IO ()
    doIO = do
      print "Hello, World"
      return ()

Listing 3.8: Haskell: Deletion of expressions in do-notation whose result is discarded

3.3.3 RLV: The reordering of elements in list literals

Lists are data structures in which the order of elements matters. Reordering the elements in list literals can cause different behavior in functions that depend on the order. However, this has the downside of generating equivalent mutants when none of the functions using the value care about the order of the list’s elements. This operator shares a downside with the reordering of pattern matching: it generates one mutant for each permutation of a list, which can lead to a blow-up in the number of mutants. It would be better to produce only a few variations of the list, but the appropriate number of variations for any particular list is still an open question; for the purpose of this thesis we use the naive implementation, which simply calculates all permutations.

    let f = ["a", "b", "c"]

    -- becomes

    let f = ["a", "c", "b"]

Listing 3.9: Haskell: Reordering of elements in list literals
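The growth in the number of mutants can be made concrete with a small sketch. It operates on plain Haskell lists rather than on the AST, the name reorderMutants is hypothetical, and filtering out the original order is an assumption made here; it is not necessarily how the MuCheck extension behaves.

```haskell
import Data.List (permutations)

-- Naive RLV sketch: every permutation of the literal except the original order.
-- For a list of n distinct elements this yields n! - 1 mutants.
reorderMutants :: Eq a => [a] -> [[a]]
reorderMutants xs = filter (/= xs) (permutations xs)
```

A three-element literal produces 5 mutants, while a six-element literal already produces 719, which illustrates why limiting the number of variations would be preferable.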

3.3.4 DLV: Deleting elements in list literals

In contrast to the previous mutation operator, it is speculated that the deletion of elements in a list is less likely to generate equivalent mutants. This is because a function which traverses an entire list performs fewer operations if the list is shortened.

    let f = ["a", "b", "c"]

    -- becomes

    let f = ["a", "b"]

Listing 3.10: Haskell: Deletion of elements in list literals
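A minimal sketch of element deletion on plain lists is given below. The name deleteMutants is hypothetical, and deleting exactly one element per mutant is an assumption of this sketch rather than a description of the implemented operator.

```haskell
-- DLV sketch: one mutant per position, each with a single element removed.
-- A list of n elements yields n mutants.
deleteMutants :: [a] -> [[a]]
deleteMutants xs = [ take i xs ++ drop (i + 1) xs | i <- [0 .. length xs - 1] ]
```

Under this assumption the number of mutants grows linearly with the length of the literal.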

3.3.5 FLLV: Replacing list literals with their first and last elements

This mutation operator is discussed by the authors of the MuCheck publication but it does not already exist in the tool. A list literal is replaced with new lists containing solely the first or solely the last element.

    let f = ["a", "b", "c"]

    -- becomes

    let f = ["a"]

    -- and

    let f = ["c"]

Listing 3.11: Haskell: Replacing list literals with lists of their first and last elements

3.3.6 IOF: The introduction of new function bindings that are misspelled versions of existing functions

This fault is based on the notion of a programmer making a spelling mistake and thus defining two different functions instead of pattern matching on the arguments of one. This mutation can be caught at compile-time with certain flags, but it will otherwise generate a run-time error. If the test-suite does not issue an error, that is an indication that not all patterns appear in the tests and that there is a gap in the suite.

    factorial 0 = 1
    factorial n = n * factorial (n-1)

    -- becomes

    factorial 0 = 1
    factoiral n = n * factorial (n-1)

3.3.7 RCE: Reordering of case expressions

This operation is similar to the reordering of pattern matches in function bindings but it is applied to case expressions. It also has the downside of generating a large number of mutants, depending on the number of cases.

    data Color = Red | Blue | Green

    colorName :: Color -> String
    colorName x = case x of
      Red  -> "Red"
      Blue -> "Blue"
      _    -> "Green"

    -- becomes

    colorName :: Color -> String
    colorName x = case x of
      _    -> "Green"
      Red  -> "Red"
      Blue -> "Blue"

Listing 3.13: Reordering of case expressions

3.3.8 DCE: Deletion of case patterns

In Haskell, it is possible to have non-exhaustive case expressions. This means that a programmer may forget to add a particular pattern or accidentally delete an existing one.

    data Color = Red | Blue | Green

    colorName :: Color -> String
    colorName x = case x of
      Red  -> "Red"
      Blue -> "Blue"
      _    -> "Green"

    -- becomes

    colorName :: Color -> String
    colorName x = case x of
      Red -> "Red"
      _   -> "Green"

Listing 3.14: Deletion of case patterns

3.3.9 MBO: Mutation of binary boolean operations

It is possible to cover more cases in the truth table of a boolean operation by replacing it with its left-hand or right-hand operand. This mutation complements the negation of conditional expressions.

    f :: Bool -> Bool
    f x = True || x

    -- becomes

    f :: Bool -> Bool
    f x = True

    -- and

    f :: Bool -> Bool
    f x = x

Listing 3.15: Mutation of boolean expressions
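The transformation can be sketched on a small expression type. The type BExp is a hypothetical stand-in for the real Haskell AST, and the sketch only rewrites the outermost operation; it is an illustration of the idea, not the implemented operator.

```haskell
-- A tiny boolean expression type standing in for the real Haskell AST.
data BExp
  = Lit Bool
  | Var String
  | And BExp BExp
  | Or BExp BExp
  deriving (Eq, Show)

-- MBO sketch: a binary boolean operation yields its left and right operands as mutants.
-- Only the outermost operation is considered; other expressions yield no mutants.
mboMutants :: BExp -> [BExp]
mboMutants (And l r) = [l, r]
mboMutants (Or l r)  = [l, r]
mboMutants _         = []
```

Applied to `Or (Lit True) (Var "x")`, which models the body of f above, the sketch returns the two expressions that correspond to the mutants in the listing.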

3.3.10 MNA: Mutating not applications

While MuCheck does have operators which negate boolean literals, if-expressions and pattern guards. It does not possess any operations for the “not” function which is arguably used often when handling boolean values and functions. Composing the not function with itself will cover cases where the function is passed as a value and then subsequently applied under a different name.

f :: Bool -> Bool
f x = not x

-- becomes

f x = (not.not) x

Listing 3.16: Mutating applications of the not function

3.4 Conclusion

This chapter should have clarified the concept of mutation operators. In a later chapter, a number of programs will be analyzed using the operators currently supported by MuCheck. The results will be used to improve the corresponding test-suite. The proposed mutation operators will also be compared by looking at the number of additional mutants that are generated. This will allow for the evaluation of the extension to MuCheck.

4 MuCheck Technical Details

4.1 Introduction

This chapter is intended to provide information on the implementation details of MuCheck and to provide an example of how Mutation Operators can be implemented. MuCheck uses an AST (Abstract Syntax Tree) based approach to generate the mutations. Haskell source code is parsed using the haskell-src-exts package, which also provides an AST to work with. The AST is manipulated using the Scrap Your Boilerplate library (Lämmel & Jones, 2003).

4.2 Mutant generation

MuCheck provides a domain-specific language for specifying Mutation Operators. The framework defines the MuOp datatype to represent Mutation Operators. The types on the right-hand side of the data declaration are all from the haskell-src-exts package. The Mutable typeclass has the function (==>), which can be used to construct MuOp values. Additionally, a number of other functions are defined in terms of (==>).

-- | MuOp constructor used to specify mutation transformation
data MuOp = N (Name_, Name_)
  ...

class Mutable a where
  (==>) :: a -> a -> MuOp

-- | The function `==>*` pairs up the given element with all elements of the
-- second list, and applies `==>` on them.
(==>*) :: Mutable a => a -> [a] -> [MuOp]
(==>*) x = map (x ==>)

-- | The function `*==>*` pairs up all elements of first list with all elements
-- of second list and applies `==>` between them.
(*==>*) :: Mutable a => [a] -> [a] -> [MuOp]
xs *==>* ys = concatMap (==>* ys) xs

Listing 4.1: MuOp data declaration
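As a rough illustration of how these combinators compose, the sketch below rebuilds them over a toy MuOp. The Sym newtype, the SymOp constructor and comparisonOps are assumptions introduced for this example; the real MuOp constructors wrap haskell-src-exts AST types instead.

```haskell
-- Toy stand-in for an operator symbol in the AST (hypothetical).
newtype Sym = Sym String
  deriving (Show, Eq)

-- Toy stand-in for MuOp: a mutation is a (from, to) pair.
newtype MuOp = SymOp (Sym, Sym)
  deriving (Show, Eq)

class Mutable a where
  (==>) :: a -> a -> MuOp

instance Mutable Sym where
  a ==> b = SymOp (a, b)

-- Pair one element with every element of a list.
(==>*) :: Mutable a => a -> [a] -> [MuOp]
(==>*) x = map (x ==>)

-- Pair every element of the first list with every element of the second.
(*==>*) :: Mutable a => [a] -> [a] -> [MuOp]
xs *==>* ys = concatMap (==>* ys) xs

-- All replacements between comparison operators, identity pairs included.
comparisonOps :: [MuOp]
comparisonOps = ops *==>* ops
  where ops = map Sym ["<", "<=", ">", ">="]
```

With four symbols, comparisonOps contains sixteen pairs, four of which map a symbol to itself; a caller would presumably filter those out before generating mutants.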

The users of MuCheck can provide a Config value to configure the analysis. The options allow the user to set the maximum number of mutants to be generated and the sample size for each type of mutant. There is also experimental support for the generation of higher order mutants.

While using the version of MuCheck with support for additional Mutation Operators, it became apparent that the tool ignored any user-provided Config and simply used a default value. For the purpose of this thesis, MuCheck was modified so that a user-provided configuration would actually be used for Mutation Analysis. Additionally, another modification was made to increase performance: it prevents MuCheck from generating any mutants for types with a sample value of 0.0.

data Config = Config {
    muOp :: [FnOp]
  , doMutatePatternMatches :: Rational
  ...
  , doMutateFunctions :: Rational
  , doNegateIfElse :: Rational
  , doNegateGuards :: Rational
  , maxNumMutants :: Int
  , genMode :: GenerationMode }
  deriving Show

Listing 4.2: Config datatype declaration

4.3 Running tests

MuCheck runs tests using the hint package (Authors, 2007), a Haskell interpreter. The mutants are loaded into the interpreter and then the appropriate test runner function is called.

A small caveat is that the hint package is incompatible with the haskell-src-exts package. This is why the mutants MuCheck generates are written to disk in a folder named .mutants located in the current working directory.

MuCheck is currently limited to only being able to analyze one Haskell module. The module must also contain all of the test code. The tests in a module need to be annotated in the form of:

{-# ANN test1 "Test" #-}
test1 = TestCase (assertBool "Test" True)

Listing 4.3: An example of an annotated test case

This allows MuCheck to exclude the test code from the mutation process.

A typeclass, Summarizable, allows MuCheck to be extended with support for other testing frameworks.

-- | Interface to be implemented by a test framework
class Typeable s => Summarizable s where
  -- | Summary of test suite on a single mutant
  testSummary :: Mutant -> TestStr -> InterpreterOutput s -> Summary
  -- | Was the test run a success
  isSuccess :: s -> Bool
  -- | Was the test run a failure
  isFailure :: s -> Bool

  -- | Was the test run neither (gaveup/timedout)
  isOther :: s -> Bool
  isOther x = not (isSuccess x) && not (isFailure x)

Listing 4.4: The Summarizable typeclass, which can be leveraged to support other testing frameworks

4.4 Implementing Mutation Operators

The selectValOps function is a higher-order function that traverses the AST of a module and uses a predicate function to identify values as mutation candidates. The function then uses the provided mapping function to generate the appropriate MuOp values.

selectValOps :: (Typeable b, Mutable b) => (b -> Bool) -> (b -> [b]) -> Module_ -> [MuOp]
selectValOps predicate f m = concat [ x ==>* f x | x <- vals ]
  where vals = listify predicate m

Listing 4.5: The selectValOps function

The selectBoolOps function is an example showing how the transformation rules can be implemented. The isBoolOp function identifies all applications of the logical && and || functions. The function convert replaces them with the values of their left-hand and right-hand side.

selectBoolOps :: Module_ -> [MuOp]
selectBoolOps m = selectValOps isBoolOp convert m
  where isBoolOp :: Exp_ -> Bool
        isBoolOp (InfixApp _ _ (QVarOp _ (UnQual _ (Symbol _ n))) _) = n `elem` ["&&", "||"]
        isBoolOp _ = False
        convert (InfixApp _ lhs (QVarOp _ (UnQual _ (Symbol _ _))) rhs) = [lhs, rhs]
        convert _ = []
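The traversal that selectValOps and selectBoolOps rely on can be sketched without the haskell-src-exts and Scrap Your Boilerplate packages. The example below uses only Data.Data from base; the Expr type and the functions collect, operands and boolOpMutations are toy stand-ins assumed for illustration, not MuCheck's actual definitions.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, gmapQ)
import Data.Typeable (Typeable, cast)

-- Toy expression type standing in for the haskell-src-exts AST.
data Expr
  = Var String
  | Lit Bool
  | InfixApp Expr String Expr
  deriving (Show, Eq, Data)

-- Collect every sub-term of type b that satisfies the predicate,
-- in the spirit of listify from Scrap Your Boilerplate.
collect :: (Data a, Typeable b) => (b -> Bool) -> a -> [b]
collect p x = here ++ concat (gmapQ (collect p) x)
  where
    here = case cast x of
      Just b | p b -> [b]
      _            -> []

-- Is this node an application of && or ||?
isBoolOp :: Expr -> Bool
isBoolOp (InfixApp _ op _) = op `elem` ["&&", "||"]
isBoolOp _                 = False

-- The two replacement candidates for a binary operation.
operands :: Expr -> [Expr]
operands (InfixApp l _ r) = [l, r]
operands _                = []

-- Every (original, replacement) pair for the boolean operations in a term.
boolOpMutations :: Expr -> [(Expr, Expr)]
boolOpMutations e = [ (x, y) | x <- collect isBoolOp e, y <- operands x ]
```

Each boolean operation found anywhere in the term contributes two pairs, which matches the two mutants per operation described for the MBO operator.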


4.5 Conclusion

This chapter should have clarified how the set of mutation operators from Chapter 3 is implemented using the Mutable typeclass and the selectValOps function. Additionally, some of the limitations of MuCheck should have become apparent, such as the manual work required to annotate test functions and its inability to analyze more than one module. The fact that MuCheck did not properly propagate user-provided configurations was troublesome for this thesis and required modifying the tool.

5 Case Studies

5.1 Introduction

This chapter presents the results of the case studies carried out with MuCheck. The first four case studies go into depth on improving their test-suites using MuCheck, while the remaining studies are used solely to evaluate the performance of the proposed extensions to MuCheck. All programs have been slightly adapted from their original versions because of the limitations of MuCheck: it can only analyze one Haskell module at a time, and all tests in the module must be annotated to exclude them from the mutant generation process.

The test cases that are written during the case studies are created by looking at gaps in the functions covered by the specification and by examining some of the generated mutants individually. Equivalent mutants are dealt with by first raising the mutant kill percentage as high as possible and then manually examining the stubborn mutants that remain to determine whether or not they are equivalent to the original program.

Of the first four case studies, two programs are tested using QuickCheck and the other two are tested using HUnit. We expect property-based test-suites to achieve higher Mutation Adequacy Scores than unit-test suites during the case studies. For the purpose of analyzing mutant distribution, the types of mutants are sorted into the categories in Table 1 and Table 2.

Tab. 1: (Default Mutation Operators)

Type                              Operators
Value Mutation                    MPV
Pattern Matching                  RDPM
Type aware Function Replacement   TFR
Negation of Guards                NPG
Negation of If-expressions        NIF

Tab. 2: (Proposed Mutation Operators)

Type                              Operators
Boolean Operations                MBO
Do-notation Expression Deletion   UEDDN, DEDDN
Case Expressions                  RCE, DCE
Function Bindings                 IOF
List literal mutation             RLV, DLV, FLLV
Mutating not applications         MNA

5.2 Triangles

5.2.1 Description

The following is a program that determines the type of a triangle based on the lengths of its sides. It uses HUnit as a testing framework, and each of the tests simply passes certain arguments to the function and states an expectation that the result should satisfy.

triangle :: Integer -> Integer -> Integer -> Shape
triangle x y z
  | x + z <= y || x + y <= z || z + y <= x
  = NoTriangle
  | x == y && x == z
  = Equilateral
  | x == y || x == z || y == z
  = Isosceles
  | x ^ 2 + y ^ 2 == z ^ 2 || x ^ 2 + z ^ 2 == y ^ 2 || y ^ 2 + z ^ 2 == x ^ 2
  = Rectangular
  | otherwise
  = Other

Listing 5.1: Triangle: Code

pythagoreanTriples :: Integer -> [(Integer, Integer, Integer)]
pythagoreanTriples n = [(x, y, z) | x <- [1 .. n]
                                  , y <- [1 .. n]
                                  , z <- [1 .. n]
                                  , x <= y
                                  , (x ^ 2) + (y ^ 2) == (z ^ 2)]

testRightTriangle :: Bool
testRightTriangle =
  all (\(x, y, z) -> triangle x y z == Rectangular) (pythagoreanTriples 50)

test1 = TestCase (assertBool "Rectangular" testRightTriangle)
test2 = TestCase (assertEqual "Isosceles" (triangle 6 6 4) Isosceles)
test3 = TestCase (assertEqual "NoTriangle" (triangle 0 1 2) NoTriangle)
test4 = TestCase (assertEqual "Equilateral" (triangle 4 4 4) Equilateral)

Listing 5.2: Triangles: Initial specification

5.2.2 Analysis

MuCheck provides us with the following results:

Total mutants: 104 (basis for %)
Covered: not provided
Sampled: 104
Errors: 6 (5%)
Alive: 17/98
Killed: 81/98 (82%)

Listing 5.3: Triangles: Initial MuCheck results
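The percentages in this report can be reproduced with a few lines of arithmetic. The sketch below assumes that the values are truncated to whole percentages, which is an inference from the reported figures (81/98 is shown as 82% rather than 83%) and not something taken from MuCheck's source; the function names are hypothetical.

```haskell
-- Integer percentage, truncated toward zero (assumed from the report above).
-- The denominator must be non-zero.
percent :: Int -> Int -> Int
percent part whole = (part * 100) `div` whole

-- Mutants that produced errors are excluded before the kill rate is computed,
-- which is why the report shows 98 rather than 104 as the denominator.
killRate :: Int -> Int -> Int -> Int
killRate total errors killed = percent killed (total - errors)
```

Here percent 6 104 gives the error percentage and killRate 104 6 81 gives the kill percentage of the initial test-suite.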

While the percentage of killed mutants is impressive, the test-suite can likely still be improved. The generated mutants show that the tests allow the incorrect classification of triangles as equilateral and isosceles. To fix this, more test cases need to be added for the different triangle types. The first of these are intended to ensure that certain triangles are not classified as equilateral or isosceles. However, MuCheck reports no difference in the strength of the test-suite.

notEquals :: Integer -> [(Integer, Integer, Integer)]
notEquals n = [(x, y, z) | x <- [1 .. n]
                         , y <- [1 .. n]
                         , z <- [1 .. n]
                         , x /= y
                         , y /= z
                         , z /= x
                         ]

testNotEquilateral :: Bool
testNotEquilateral = all (\(x, y, z) -> triangle x y z /= Equilateral) (notEquals 50)

testNotIsoceles :: Bool
testNotIsoceles = all (\(x, y, z) -> triangle x y z /= Isosceles) (notEquals 50)

test5 = TestCase (assertBool "Equilateral" testNotEquilateral)
test6 = TestCase (assertBool "Isoceles" testNotIsoceles)

Total mutants: 104 (basis for %)
Covered: not provided
Sampled: 104
Errors: 6 (5%)
Alive: 17/98
Killed: 81/98 (82%)

Attempting to improve the results requires examining the surviving mutants and constructing test cases that target them. Listing 5.4 shows two examples of the mutations that survive.

-- Example 1
triangle x y z
  | x + z < y || x + y <= z || z + y <= x = NoTriangle
  | x == y && x == z = Equilateral
  | x == y || x == z || y == z = Isosceles
  | x ^ 2 + y ^ 2 == z ^ 2 ||
    x ^ 2 + z ^ 2 == y ^ 2 || y ^ 2 + z ^ 2 == x ^ 2
  = Rectangular
  | otherwise = Other

-- Example 2
triangle x y z
  | x + z <= y || x + y <= z || z * y <= x = NoTriangle
  | x == y && x == z = Equilateral
  | x == y || x == z || y == z = Isosceles
  | x ^ 2 + y ^ 2 == z ^ 2 ||
    x ^ 2 + z ^ 2 == y ^ 2 || y ^ 2 + z ^ 2 == x ^ 2
  = Rectangular
  | otherwise = Other

Listing 5.4: Triangles: Examples of surviving mutants

The first example shows that the <= comparison in the guard was replaced with <. The second shows that + was replaced with *. This shows that just testing that the triangle function returns the correct Shape for the supplied arguments is insufficient, because a test may pass when the arguments are in certain positions but not in others. For each test case that expects a certain type of triangle, there should be additional test cases that expect the same result but supply the arguments to the triangle function in a different order. Listing 5.5 shows an example of this. The test cases test7 and test8 increase the kill percentage to 83%. As expected, they kill some of the mutants that survive depending on the order of the arguments.

Listing 5.5:Triangles: Examples of surviving mutants
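The dependence on argument order can be demonstrated with the second surviving mutant. The sketch below is not the thesis's test7 and test8; the Shape declaration, triangleWith, triangleMutant and allOrders are assumptions made so that the example is self-contained. It parameterises the classification over the operator used in the third NoTriangle check, so the original and the mutant share one definition.

```haskell
import Data.List (permutations)

-- Assumed declaration; the constructors are those used in Listing 5.1.
data Shape = NoTriangle | Equilateral | Isosceles | Rectangular | Other
  deriving (Show, Eq)

-- Classification with the operator of the third NoTriangle check left open.
triangleWith :: (Integer -> Integer -> Integer)
             -> Integer -> Integer -> Integer -> Shape
triangleWith op x y z
  | x + z <= y || x + y <= z || z `op` y <= x = NoTriangle
  | x == y && x == z = Equilateral
  | x == y || x == z || y == z = Isosceles
  | x ^ 2 + y ^ 2 == z ^ 2 || x ^ 2 + z ^ 2 == y ^ 2 || y ^ 2 + z ^ 2 == x ^ 2
  = Rectangular
  | otherwise = Other

triangle, triangleMutant :: Integer -> Integer -> Integer -> Shape
triangle       = triangleWith (+)  -- original program
triangleMutant = triangleWith (*)  -- Example 2: + replaced with *

-- Does the classifier return the expected shape for every order of the sides?
allOrders :: (Integer -> Integer -> Integer -> Shape)
          -> Shape -> (Integer, Integer, Integer) -> Bool
allOrders f expected (x, y, z) =
  all (\sides -> classify sides == expected) (permutations [x, y, z])
  where
    classify [a, b, c] = f a b c
    classify _         = error "expected exactly three sides"
```

A single assertion such as triangleMutant 2 3 5 == NoTriangle still passes on the mutant, whereas allOrders triangleMutant NoTriangle (2, 3, 5) fails because the order 5 2 3 is classified as Other.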

These are the results after adding 29 more similar test cases:

Total mutants: 104 (basis for %)
Covered: not provided
Sampled: 104
Errors: 6 (5%)
Alive: 0/98
Killed: 98/98 (100%)

In total, 35 test cases were written so that 100% of the mutants are killed. This gives us a Mutation Adequacy Score of 100%. The full list of test cases is available in the source code in the appendix.

Another thing that is worth noting is that the order of the arguments should not affect the output of the triangle function. This is known as the transposition property. A total of 35 test cases were written to kill 100% of the mutants, mostly because many of them were simply variations of one another with only the order of the arguments changed. We can assume that it would have required less code to kill all of the mutants using a more property-based testing approach, e.g. using a framework like QuickCheck instead of HUnit. Indeed, an alternative test-suite was also created using QuickCheck instead of HUnit, which makes use of the properties of congruence and transposition and of properties of the different triangles. The alternative test-suite manages to kill 100% of the mutants while at the same time requiring less code to be written.

pythagoreanTriples n = [(x, y, z) | x <- [1 .. n]
                                  , y <- [1 .. n]
                                  , z <- [1 .. n]
                                  , (x ^ 2) + (y ^ 2) == (z ^ 2)
                                  ]

transpose s x y z = (triangle x y z == s) && prop_transpose1 x y z

transpose' s x y z = (triangle x y z /= s) && prop_transpose1 x y z

genRightTriangle = elements (pythagoreanTriples 50)

prop_multN (Positive n) x y z = triangle x y z == triangle (n*x) (n*y) (n*z)

prop_transpose1 x y z = (triangle x y z == triangle y z x) &&
                        (triangle x y z == triangle y x z) &&
                        (triangle x y z == triangle z y x) &&
                        (triangle x y z == triangle z x y) &&
                        (triangle x y z == triangle x z y)

prop_inequality1 (Positive n) = transpose NoTriangle n n' n'
  ...

prop_inequality2 (Positive n) = transpose NoTriangle n n' n'
  where n' = (n - 1) `div` 2

prop_isosceles (Positive x) (Positive y) = x > y ==> transpose Isosceles x y x

prop_equilateral (Positive n) = transpose Equilateral n n n

prop_notEquilateral (Positive x) (Positive y) = x > y ==> transpose' Equilateral x y x

prop_rec = forAll genRightTriangle (\(x,y,z) -> transpose Rectangular x y z)

prop_notRec1 (Positive x) (Positive y) (Positive z) = x^2 + y^2 < z^2 ==> transpose' Rectangular x y x

prop_notRec2 (Positive x) (Positive y) (Positive z) = x^2 + y^2 > z^2 ==> transpose' Rectangular x y x

Listing 5.6: Triangles: Alternative QuickCheck test-suite

In conclusion, we see that mutation testing makes it possible to identify gaps in the specification and thus makes it easier to improve it. It was possible to kill 100% of the generated mutants, indicating that MuCheck did not generate any mutants that were equivalent to the original program. The property-based test-suite manages to kill 100% of the mutants generated while containing fewer test cases than the HUnit suite. This may mean that Mutation Testing favors property-based testing over unit testing, because a single property can kill a whole class of mutants.

5.2.3 Mutant Type Distribution

The majority of mutants generated using MuCheck’s default set belonged to the Type aware Function Replacement category (Table 3). The program makes heavy use of function symbols so the replacement of these symbols was to be expected. The pattern matching and negation of if-expressions operators were not applied at all. This is because the program only consisted of one function and made no use of pattern matching. At the same time, the program only made use of guards and no if-statements.

Tab. 3: (Triangles: Default MuCheck Mutation Operators)

Type                              Amount   Percentage
Value Mutation                    27       26%
Pattern Matching                  0        0%
Type aware Function Replacement   73       70.2%
Negation of Guards                4        3.8%
Negation of If-expressions        0        0%

Running MuCheck once again using the newly proposed mutation operators only generates mutants for the Boolean Operations category. The total number of mutants generated is 14, with 3 of them still surviving the test-suite (Table 4). The Boolean Operations mutation operators thus reveal more hard-to-catch faults in the specification.

Tab. 4: (Triangles: Proposed Mutation Operators)

Type                              Amount   Percentage
Boolean Operations                14       100%
Do-notation Expression Deletion   0        0%
Case Expressions                  0        0%
Function Bindings                 0        0%
List literal mutation             0        0%
Mutating not applications         0        0%

5.3 The Knight’s Tour

5.3.1 Description

This program is designed to solve the Knight’s tour chess problem, which is a type of graph problem. To be more specific, it is a Hamiltonian path problem (Axel Conrad, 1994). The goal is to find a sequence of chess moves that a knight can make so that it visits every square on the board exactly once. The program only tries to find the first full path and does not attempt to find all of the different paths a knight can take. The program is tested using QuickCheck properties.

board :: Int -> [ChessPosition]

availableMoves :: ChessPosition -> [ChessPosition]

knightTour :: ChessPosition -> [ChessPosition]

knightTour' :: [ChessPosition] -> [ChessPosition]

Listing 5.7: Knight’s Tour: Top-level type definitions

The board function creates a list of squares on a chess board based on the size provided. The availableMoves function returns a list of moves a knight can make at the given position. Lastly, the knightTour function finds the solution to the chess problem for the provided start position using the knightTour' function, which calls itself recursively to find the path.

The program was originally tested using the following properties. The introduction of the genChessPosition function was necessary because MuCheck is not able to exclude the class instance declaration as a candidate for mutation.

genChessPosition = do
  x <- choose (1, 8)
  y <- choose (1, 8)
  return (ChessPosition x y)

instance Arbitrary ChessPosition where
  arbitrary = genChessPosition

isOnBoard (ChessPosition x y) = onBoard x && onBoard y
  where
    onBoard z = z >= 1 && z <= 8

prop_boardSizeCorrect (Positive x) = length (board x) == x^2

prop_availableMovesOnBoard s = all isOnBoard (availableMoves s)

prop_length s = length path <= 8*8 && (not.null) path
  where
    path = knightTour s

prop_unique s = nub path == path
  where
    path = knightTour s

prop_allOnBoard s = all isOnBoard path
  where
    path = knightTour s

prop_jump move@(ChessPosition x y) = all (intersects move) (availableMoves move)
  where intersects (ChessPosition a b) (ChessPosition c d) = abs (a - c) /= abs (b - d)

Listing 5.8: Knight’s Tour: Initial test-suite

5.3.2 Analysis

Analyzing the program using MuCheck showed that only 58% of mutants are killed by the test-suite. The errors are due to MuCheck replacing Num functions on Integers with the / function, which results in type errors. Examining the surviving mutants shows that the comparison operator used in the availableMoves function was changed. This implies that simply verifying that all of the moves it generates are on the chess board with prop_availableMovesOnBoard is not sufficient. This can be remedied by specifying an additional property stating that availableMoves should always return at least one possible move.

prop_notNullAvailableMoves x = (not.null) (availableMoves x)

Listing 5.9: The notNullAvailableMoves property

Killed: 18/31 (58%)

Despite adding the property, analyzing the program with MuCheck produces the same results. MuCheck still manages to generate mutants that survive the tests. This means that the added property was likely too permissive and should be stronger. Instead of just ensuring that there are available moves at every position on the board, we should specify a range for the number of moves the knight can make. On an 8 by 8 board, a knight can have between 2 and 8 possible moves on any square. The property increases the kill rate to 87%.

prop_numberOfMovesAvailableMoves x = length (availableMoves x) `elem` range
  where range = [2..8]

Listing 5.11: The numberOfMovesAvailableMoves property

Killed: 27/31 (87%)

Listing 5.12: Knight’s Tour: MuCheck results with the number of available moves property

The new property manages to kill the mutations of availableMoves that result in it producing an incorrect number of chess moves.
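The range of 2 to 8 moves can be confirmed by enumerating every square. The following stand-alone sketch models the board with plain coordinate pairs; Square, squares, knightMoves and moveCounts are hypothetical names and do not correspond to the program under test.

```haskell
-- A square on an 8x8 board as a (column, row) pair, both in 1..8.
type Square = (Int, Int)

squares :: [Square]
squares = [ (x, y) | x <- [1 .. 8], y <- [1 .. 8] ]

-- All squares a knight can reach from the given square without leaving the board.
knightMoves :: Square -> [Square]
knightMoves (x, y) =
  [ (x', y')
  | i <- offsets
  , j <- offsets
  , abs i /= abs j
  , let x' = x + i
  , let y' = y + j
  , x' >= 1, x' <= 8, y' >= 1, y' <= 8
  ]
  where offsets = [1, -1, 2, -2]

-- Number of available moves for every square on the board.
moveCounts :: [Int]
moveCounts = map (length . knightMoves) squares
```

The minimum of moveCounts is reached in the corners and the maximum in the centre of the board, which gives the bounds used in the property.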

As an additional note, if we examine the specification closely we can see that the prop_boardSizeCorrect and prop_length properties look very similar and may very well be testing the same things. To test this, prop_boardSizeCorrect was removed, and it resulted in no difference in the results given by MuCheck. This implies that the prop_length property subsumes the prop_boardSizeCorrect property. MuCheck made it possible to identify that there were tests that did not actually contribute to the strength of the test-suite.

Only four mutants remain which cannot be killed due to them being equivalent to the original program. The first mutant changes the order of the bindings of the knightTour' function. In Haskell one can use the order of function bindings to indicate precedence in the case of overlapping patterns but in this program the patterns do not overlap. Thus changing the order of the bindings does not change the semantics of the program.

-- Original
knightTour' [] = []
knightTour' moves@(currPos : _)
  | null possibleMoves = reverse moves
  | otherwise = knightTour' $ nextMove : moves
  where findMoves s = availableMoves s \\ moves
        possibleMoves = findMoves currPos
        nextMove = minimumBy (comparing (length . findMoves)) possibleMoves

-- Mutant
knightTour' :: [ChessPosition] -> [ChessPosition]
knightTour' moves@(currPos : _)
  | null possibleMoves = reverse moves
  | otherwise = knightTour' $ nextMove : moves
  where findMoves s = availableMoves s \\ moves
        possibleMoves = findMoves currPos
        nextMove = minimumBy (comparing (length . findMoves)) possibleMoves
knightTour' [] = []

Listing 5.13: The mutation of the knightTour' function

The second mutant was created from the same function, except that instead of changing the order, the binding for the empty list pattern was removed entirely. The mutant survives because none of the test cases ever calls knightTour' with an empty list. This highlights that Test Coverage is actually a prerequisite for Mutation Testing, because mutating part of a program that is never tested (or run) will always result in a mutant that is equivalent with regard to the specification.

The last two mutants were mutations of the availableMoves function, in which the addition operator is swapped with the subtraction operator. The original function calculates all moves available to a knight by adding the values in v to the provided position. Both positive and negative versions of both numbers (1 and 2) are present in the list v. Because of this, the addition and subtraction operators can be swapped and the end result will still contain the same elements, only in a different order. The elements being in a different order does not affect the rest of the program. Taking these four equivalent mutants into account, calculating the Mutation Adequacy Score of the suite gives us a value of 100%.

availableMoves (ChessPosition x y) = filter (`elem` board bSize) ls
  where v = [1, -1, 2, -2]
        ls = [ChessPosition (x - i) (y + j) | i <- v, j <- v, abs i /= abs j]

availableMoves :: ChessPosition -> [ChessPosition]
availableMoves (ChessPosition x y) = filter (`elem` board bSize) ls
  where v = [1, -1, 2, -2]
        ls = [ChessPosition (x + i) (y - j) | i <- v, j <- v, abs i /= abs j]

Listing 5.14: The mutation of the ‘availableMoves‘ function
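The equivalence argument can be checked directly: because the offset list is closed under negation, swapping + for - only permutes the generated squares. The sketch below uses plain coordinate pairs, and the names offsets, candidates and sameSquares are assumptions made for illustration.

```haskell
import Data.List (sort)

offsets :: [Int]
offsets = [1, -1, 2, -2]

-- Candidate squares computed with the given operators for the two coordinates.
candidates :: (Int -> Int -> Int) -> (Int -> Int -> Int)
           -> (Int, Int) -> [(Int, Int)]
candidates opX opY (x, y) =
  [ (x `opX` i, y `opY` j) | i <- offsets, j <- offsets, abs i /= abs j ]

-- Do both mutants produce the same squares as the original, ignoring order?
sameSquares :: (Int, Int) -> Bool
sameSquares p =
  all ((== reference) . sort) [candidates (-) (+) p, candidates (+) (-) p]
  where reference = sort (candidates (+) (+) p)
```

The unsorted lists do differ, so a test that depended on the order of the generated moves would kill these mutants; the program itself does not depend on that order.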

5.3.3 Mutant Type Distribution

Most of the generated mutants belong to the Value Mutation category. See Table 5. This is because the program makes use of a list of integers to calculate the movements of the Knight piece and the Value Mutation operators are applied to every value in the list. Literal values are also used to generate the positions on a chessboard and for the size of the board.

Tab. 5: (Knight’s Tour: Default MuCheck Mutation Operators)

Type                              Amount   Percentage
Value Mutation                    18       54.5%
Pattern Matching                  3        9.1%

In a similar way, the proposed set of mutation operators generates almost 100% more mutants, of which 90% belong to the List literal mutation category; 23 of these survive the tests (Table 6). The surviving mutants are all permutations of the list used in the availableMoves function. For the purpose of the program, the order of the elements in that list is irrelevant, which means that all of the surviving mutants are equivalent to the original program. The proposed set of mutation operators is therefore unlikely to lead to any improvements in the specification.

Tab. 6: (Knight’s Tour: Proposed Mutation Operators)

Type                              Amount   Percentage
Function Bindings                 1        3.1%
List literal mutation             31       96.9%

5.4 IBAN validation

5.4.1 Description

The program is intended to validate a provided IBAN (Listing 5.4.1). This is done using the standard procedure, which involves first converting the IBAN into an integer and then calculating the result of a mod-97 operation on that integer. If the result is equal to 1, the IBAN is valid.

intToList :: Int -> [Int]

removeNonAlphaNumeric :: String -> String

moveFirstFourToBack :: String -> String

validChars :: String

convertLetterToNum :: Char -> Int

toDigits :: String -> [Int]

fromDigits :: [Int] -> Integer

mod9710 :: Integer -> Bool

validateIban :: String -> Bool
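The mod-97 procedure described above can be sketched in a few lines. This is not the program under test: charValue and ibanLooksValid are hypothetical names, and the sketch checks only the mod-97 condition, not country-specific lengths or formats.

```haskell
import Data.Char (isAlphaNum, isAsciiUpper, isDigit, digitToInt, ord, toUpper)

-- Numeric value of one character: digits keep their value, A..Z map to 10..35.
charValue :: Char -> Maybe Integer
charValue c
  | isDigit c      = Just (toInteger (digitToInt c))
  | isAsciiUpper c = Just (toInteger (ord c - ord 'A' + 10))
  | otherwise      = Nothing

-- Hypothetical validator: strip separators, move the first four characters
-- to the back, convert to one large integer and test the remainder mod 97.
ibanLooksValid :: String -> Bool
ibanLooksValid raw =
  case traverse charValue rearranged of
    Just values | length cleaned > 4 -> toNumber values `mod` 97 == 1
    _                                -> False
  where
    cleaned    = map toUpper (filter isAlphaNum raw)
    rearranged = drop 4 cleaned ++ take 4 cleaned
    -- Append each value's decimal digits: one digit for 0..9, two for 10..35.
    toNumber   = foldl (\acc v -> acc * (if v < 10 then 10 else 100) + v) 0
```

For the widely used example IBAN GB82 WEST 1234 5698 7654 32 the function returns True, while changing the check digits to 83 raises the remainder by one and makes it return False.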

5.4.2 Analysis

The program was tested using the following properties:

newtype Iban = Iban String deriving Show

getRandomValidChar :: Gen Char
getRandomValidChar = elements validChars

genRandomString :: Int -> Gen String
genRandomString n = resize n (listOf getRandomValidChar)

genTestIban :: Gen Iban
genTestIban = do
  y <- choose (0, 10)
  x <- genRandomString y
  return (Iban x)

instance Arbitrary Iban
  where arbitrary = genTestIban

prop_isTwoDigitNumber :: Property
prop_isTwoDigitNumber = forAll getRandomValidChar ((\x -> x > 9 && x < 100).convertLetterToNum)

prop_samelength :: Iban -> Bool
prop_samelength (Iban x) = length (moveFirstFourToBack x) == length x

prop_mod97 :: Integer -> Bool
prop_mod97 x = mod9710 (x * 97 + 1)

prop_int :: Positive Integer -> Bool
prop_int (Positive x) = (fromDigits.digits) x' == x
  ...

The properties ensure that all characters result in two-digit numbers. Moving the first four elements of a list to the back should result in a list of the same length. Adding one to the product of a number and 97 should pass the mod9710 test. Converting a number to digits and back again should yield the same number. Analyzing the program gives us the following results:

Killed: 230/398 (57%)

Adding the following properties increases the percentage of killed mutants to 97%:

prop_sumValidChars :: Property
prop_sumValidChars = forAll getRandomValidChar (\x -> calcSum (delete x validChars) == (585 - convertLetterToNum x))
  where calcSum = foldr (\y m -> m + convertLetterToNum y) 0

prop_onlyHandlesValidChars :: Char -> Property
prop_onlyHandlesValidChars x = (x `notElem` validChars) ==> convertLetterToNum x == 0

The prop_sumValidChars property uses information about the sum of all characters in the list of validChars to ensure that the list and the convertLetterToNum function are correct. The prop_onlyHandlesValidChars function checks that all non-valid characters are converted into a value of 0. These two properties kill mutations of the list of valid characters and of the convertLetterToNum function.
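The constant 585 is consistent with validChars consisting of exactly the 26 letters A to Z, mapped to the values 10 to 35. That reading is an inference from the properties listed above, not something the listings state, and letterValue and totalLetterValue are hypothetical names used only for this check.

```haskell
-- Assumption: the valid characters are the letters A..Z with values 10..35.
letterValue :: Char -> Int
letterValue c = fromEnum c - fromEnum 'A' + 10

-- Sum of the values of all 26 letters: 10 + 11 + ... + 35.
totalLetterValue :: Int
totalLetterValue = sum (map letterValue ['A' .. 'Z'])
```

Under this assumption totalLetterValue equals 585, so removing any single character x from the list leaves a sum of 585 minus the value of x, which is exactly what prop_sumValidChars asserts.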

Killed: 392/398 (98%)

At this point we should start looking at the generated mutants. One of them is a mutation of the toDigits function, which applies not to its guard clause. This makes it apparent that there is no test written for the function. The prop_toDigits function ensures that toDigits properly converts a string of digits back to its integer form (Listing 5.15).

{-# ANN prop_toDigits "Test" #-}
prop_toDigits :: Positive Int -> Bool
prop_toDigits (Positive x) = (fromDigits.toDigits) (show x) == x'
  where x' = fromIntegral x

Listing 5.15: Property testing the toDigits function

The test manages to cover all mutations of toDigits and increases the kill percentage to 98%, leaving only three mutants. Two of the mutants are transformations of the mod9710 function. The properties in Listing 5.16 successfully kill the mutations by ensuring that == cannot be swapped with another comparison function and that the literal 1 cannot be mutated.

-- Original function
mod9710 :: Integer -> Bool
mod9710 x = n == 1
  where n = mod x 97

-- Properties
prop_mod9710' x = (not.mod9710) (x*97)

prop_mod9710'' (Positive x) (Positive y) = (y /= 1) && (0 /= mod y 97) ==> (not.mod9710) (x*97+y)

Listing 5.16: Property testing mod9710

The last surviving mutant is one where the argument patterns of the toDigits function have been reordered. This mutant is equivalent to the original program because the patterns do not overlap.

5.4.3 Mutant Type Distribution

Of all of the case studies, this program generated the most mutants. A great deal of the mutants generated were mutations of literal values (Table 7).

Tab. 7: (IBAN: Default MuCheck Mutation Operators)

Type                              Amount   Percentage
Value Mutation                    230      57.4%
Pattern Matching                  3        0.7%

The proposed mutation operators only managed to generate one function binding mutation and a total of 33 list literal mutations (Table 8).

Tab. 8: (IBAN: Proposed Mutation Operators)

Type                              Amount   Percentage
Function Bindings                 1        2.9%
List literal mutation             33       97.1%

5.5 Credit card validation

5.5.1 Description

The program makes use of the Luhn algorithm to check whether or not a credit card number is valid. The program uses HUnit tests in its specification. It is a possible solution to an exercise from Brent Yorgey’s course, “Introduction to Haskell”.
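For readers unfamiliar with the Luhn algorithm, the following is a minimal sketch of the check. It is written independently of the program under test, and the names revDigits, doubleSecond, digitSum and luhnValid are assumptions chosen for this example.

```haskell
-- Digits of a positive number, least significant first.
revDigits :: Integer -> [Integer]
revDigits n
  | n <= 0    = []
  | otherwise = n `mod` 10 : revDigits (n `div` 10)

-- Double every second digit, counting the right-most digit as the first.
doubleSecond :: [Integer] -> [Integer]
doubleSecond (a : b : rest) = a : 2 * b : doubleSecond rest
doubleSecond short          = short

-- Sum of all decimal digits, so that a doubled value of 16 contributes 1 + 6.
digitSum :: [Integer] -> Integer
digitSum = sum . concatMap revDigits

-- A number passes the Luhn check when the digit sum is a multiple of 10.
luhnValid :: Integer -> Bool
luhnValid n = digitSum (doubleSecond (revDigits n)) `mod` 10 == 0
```

The number 79927398713 passes the check with a digit sum of 70, whereas 79927398710 yields 67 and is rejected.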

lastDigit :: Integer -> Integer

dropLastDigit :: Integer -> Integer

toRevDigits :: Integer -> [Integer]

doubleEveryOther :: [Integer] -> [Integer]
