
One of the alternatives to Equivalence Partitioning is random testing. Test cases are simply chosen at random from the entire input domain, without any selection criterion.

Random testing has proved to be less effective than partition testing methods such as Equivalence Partitioning [15]. On the positive side, however, it is dynamic in nature, while all other combination strategies are static. When test cases are generated repeatedly, random testing is more likely to detect a bug, because bugs tend to become immune to static testing methods after some iterations. We elaborate further on random testing, and on how it can be customized to be as efficient as Equivalence Partitioning while remaining dynamic in nature, when building the data model for our SUT.

5.2 Open Issues

Most combination strategies apply several techniques to reduce the input space of test cases. However, they treat the parameters independently of each other.

The test cases generated by these strategies are not sufficient to verify the dependencies among parameters. In this section we demonstrate the concept of dependency among parameters with the help of the classical triangle problem.

Suppose we have an application that accepts three input parameters A, B and C as the sides of a triangle. Based on the parameter values, the application determines whether the combination of parameters can form a triangle and, if so, the category of the triangle formed, i.e.,

– Scalene: no two sides of the triangle are equal.

– Isosceles: two sides of the triangle are equal.


– Equilateral: all sides are equal.

In order to test such an application, we have to determine the values that the parameters A, B and C can take. If the requirement specification explicitly mentions the range of values that a particular parameter can take, we use that range; otherwise, ideally any positive number is a valid value for a parameter (as the sides of a triangle cannot be negative).

Let us now have a look at the interdependencies among the parameters.

– For being a triangle, it is required that the sum of any two sides is greater than the third side, i.e.,

A < B + C and B < A + C and C < A + B

– For being a scalene triangle:

A ≠ B and B ≠ C and A ≠ C

– For being an isosceles triangle:

A = B or B = C or C = A

– For being an equilateral triangle:

A = B and B = C
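These constraints can be written down directly as predicates over the three sides. The following small Python sketch (the function name classify_triangle is ours and not part of any SUT) simply mirrors them:

def classify_triangle(a: float, b: float, c: float) -> str:
    """Classify three side lengths according to the constraints above."""
    # Triangle inequality: each side must be smaller than the sum of the other two.
    if not (a < b + c and b < a + c and c < a + b):
        return "not a triangle"
    # Equilateral: A = B and B = C.
    if a == b and b == c:
        return "equilateral"
    # Isosceles: at least two sides equal.
    if a == b or b == c or c == a:
        return "isosceles"
    # Scalene: no two sides equal.
    return "scalene"

print(classify_triangle(3, 4, 5))   # scalene
print(classify_triangle(2, 2, 3))   # isosceles
print(classify_triangle(1, 1, 1))   # equilateral
print(classify_triangle(1, 2, 5))   # not a triangle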

The question that comes next is how to combine the parameters and how many combinations are required to reflect all dependencies. The answer lies in one of the oldest and most efficient ways to represent and analyze complex logical relationships: Decision Tables. A decision table is a table consisting of conditions, rules and expected outcomes. More on decision tables can be found in [16].


Next, we construct a decision table for the application at hand and gradually prune it. Pruning a decision table is the process of excluding the cases that make no sense for testing; it depends strongly on the application at hand. The initial decision table is shown in Table 5.4. The upper-left quadrant contains the conditions. The upper-right quadrant contains the condition rules for the alternatives, i.e., T means the condition holds and F means it does not hold.

The lower-left quadrant contains the actions to be taken as a result of a combination (all possible outcomes of the application), and the lower-right quadrant contains the action rules, which show the outcome for a specific combination of condition rules.

MinValue and MaxValue represent the range of values of any side of the triangle.

Table 5.4: Initial Decision Table

Each column consisting of T and F entries forms a test case. Since there are nine conditions and each condition can either be True or False, 512 combinations of condition values are possible. Some of them are feasible while others are not (e.g., cases where A = C, A = B and B ≠ C all hold at the same time). Such combinations can be pruned.
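To see why such combinations are infeasible, one can search for witness values. The rough Python sketch below uses a simplified, hypothetical encoding of the type conditions (c7: A = B, c8: B = C, c9: A = C) and only a small sample grid, so a negative answer is indicative rather than a proof:

from itertools import product

def feasible(want_c7: bool, want_c8: bool, want_c9: bool) -> bool:
    """Search a small sample grid for sides producing the requested truth values
    of the (assumed) type conditions c7: A == B, c8: B == C, c9: A == C."""
    for a, b, c in product(range(1, 6), repeat=3):
        if (a == b) == want_c7 and (b == c) == want_c8 and (a == c) == want_c9:
            return True
    return False

# The combination cited above: A = B and A = C hold, but B = C does not.
print(feasible(True, False, True))   # False -- transitivity makes this impossible
# A feasible combination: exactly two sides equal (isosceles).
print(feasible(True, False, False))  # True, e.g. A = B = 1, C = 2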

We can reduce the total number of combinations in a number of steps:

– Step 1: If the length of any side (A, B or C) of the triangle is out of its range (say it is negative), the (expected) outcome of the system should always be Value out of range. We assign the value False to one constraint (c1, c2 or c3) and the value True to the other two at a time, and put don't care (shown as ‘-’ in the decision table) for the rest of the constraints. If the implementation has not correctly implemented the first three constraints, these three combinations are sufficient to reveal this fact. More on the pruning


of combinations can be found in [16].

As a result, the total number of combinations comes down to 67 (2⁶ + 3).

– Step 2: The constraints c4, c5 and c6 check whether the combination of sides can form a triangle, and the constraints c7, c8 and c9 check for the type of triangle. If any of the constraints c4, c5 or c6 does not hold, then there is no point in checking constraints c7, c8 and c9. Therefore, we again assign the value False to one constraint (c4, c5 or c6) and the value True to the other two at a time, and simply do not care about the rest of the constraints. In this way we reduce the total number of combinations to 14 (2³ + 3 + 3).

The resulting complete table is shown in Table 5.5.

Table 5.5: Final Decision Table
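The effect of the two pruning steps can be reproduced mechanically. The sketch below (our own encoding, with '-' standing for don't care) enumerates the remaining columns and confirms the count of 14:

from itertools import product

columns = []

# Step 1: one of c1..c3 false (out of range), the other two true, rest don't care.
for i in range(3):
    col = ["T"] * 3 + ["-"] * 6
    col[i] = "F"
    columns.append(col)

# Step 2: c1..c3 true, one of c4..c6 false (no triangle), the other two true,
# type checks don't care.
for i in range(3, 6):
    col = ["T"] * 6 + ["-"] * 3
    col[i] = "F"
    columns.append(col)

# Remaining columns: c1..c6 all true, all 2^3 combinations of the type checks c7..c9.
for combo in product("TF", repeat=3):
    columns.append(["T"] * 6 + list(combo))

print(len(columns))  # 14, i.e. 3 + 3 + 2**3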

Hence, we have shown how we can prune the decision table from 512 columns down to 14, leaving a small but effective number of cases that still covers the same testing domains as the original table. Each column (except the first) of the decision table will eventually become a test case before the actual testing activity.

Now, once we have the decision table in place, we have to assign values to the parameters such that all the constraints of a column are satisfied at a time. In this particular example of triangles, when we tried to manually assign values to the parameters A, B and C for all columns of decision table 5.5, it took us several minutes, even though the dependencies are of a very primitive nature. However, most practical applications have complex dependencies among parameters. Assigning values to parameters for every column of the decision table can consume several hours, depending on the number of parameters, the number of dependencies and the


complexity of the dependencies. A graph representing the relationship between the time required to generate test cases and the number of parameters and dependencies is shown in Figure 5.3. The graph was obtained experimentally and might differ slightly for different forms of dependencies (simple or complex) in an application.

Figure 5.3: Test case generation time against number of parameters + number of dependencies


Decision Tables can efficiently represent dependencies in any form. However, if the values of the parameters have to be assigned manually, this is a tedious task and is often impossible when there are too many dependencies among parameters (in the triangle case we have only six dependencies among parameters).

In order to exploit the full potential of decision tables, we need to automate the assignment of values to the parameters in such a way that for each column of the decision table (which represents a test case) we have a satisfying assignment for its parameters. This automation makes it possible to handle any number of dependencies among parameters in an application.
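As an illustration of what such an automation has to do, a naive brute-force search over the parameter ranges can look for an assignment that satisfies every constraint of a given column. The sketch below is only illustrative (the column encoding and the bounds MIN_VALUE and MAX_VALUE are assumptions); it is not the repartitioning approach developed later in this chapter:

from itertools import product

MIN_VALUE, MAX_VALUE = 1, 20  # assumed range of a triangle side

def find_assignment(constraints):
    """Return the first (A, B, C) in range satisfying every (check, expected) pair,
    or None if no such assignment exists."""
    for a, b, c in product(range(MIN_VALUE, MAX_VALUE + 1), repeat=3):
        if all(check(a, b, c) == expected for check, expected in constraints):
            return a, b, c
    return None

# Hypothetical column "valid isosceles triangle": triangle inequality holds,
# A = B holds, B = C does not hold.
column = [
    (lambda a, b, c: a < b + c and b < a + c and c < a + b, True),
    (lambda a, b, c: a == b, True),
    (lambda a, b, c: b == c, False),
]
print(find_assignment(column))  # (2, 2, 1)

Such exhaustive search quickly becomes impractical as the ranges and the number of parameters grow, which is why a more directed approach is needed.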

Not much work has been done in this area in the literature. The only relevant work can be found in [17]. The authors describe a method, CECIL (Cause-Effect Coverage Incorporating Linear boundaries), for generating test data for problems involving complex linear dependencies between two or more variables. The CECIL method was integrated into the test generation tool set IDATG (Integrated Design and Automated Test case Generation) [18]. In [19], the authors used TEMPPO Designer, a model-based test automation tool which integrates IDATG, to model


the electronic interlocking system for railways. They used the CECIL method to model the data. However, the CECIL method described in this work only allows linear dependencies, i.e., 5 * A ≤ B is allowed but C * A ≤ B is not. The dependencies in most practical applications, including the current SUT, are non-linear in nature. Therefore, the proposed method was inadequate for generating test data for our application.

We propose an algorithm that effectively assigns values to parameters for every column of the decision table for the current SUT. However, before going into the details of the algorithm or of the dependencies among parameters in the SUT, we describe some concepts required to understand the algorithm.

Data dependent equivalence classes

Suppose we have an application with two parameters (say A and B) and one dependency of the form A = B + 10, where A has valid values between 5 and 15 and B has valid values between 0 and 10, i.e.,

5 ≤ A ≤ 15
0 ≤ B ≤ 10

The decision table for this application is shown in Table 5.6. We deliberately did not write the rules in the table, as rules are application specific and irrelevant in the present context.

Table 5.6: Decision Table

c1: 5 ≤ A ≤ 15    T    T    T    F
c2: 0 ≤ B ≤ 10    T    T    F    T
c3: A = B + 10    T    F    -    -

Four test cases, corresponding to the four columns of the decision table, have to be generated in order to verify the dependency. For the last two columns we can simply assign values to the parameters A and B based on equivalence partitioning, as these columns do not take the dependency into account. However, in order to assign values for the first two columns of the decision table, we have to repartition the original equivalence classes of the parameters.

The motivation behind repartitioning the original partition of valid values of a variable is to make sure that if we select any value from the valid partition of one parameter, we are always left with some valid values of the other parameters that can reflect the dependency among them. We do not lose anything by repartitioning, as the new range of valid values remains a subset of the original range of valid values.


Suppose we are constructing the test case for the first column of the decision table, where all conditions are true. Constructing such a combination leads to the conclusion that B + 10 should have the same range as A, because selecting B as 9, which is valid, would force A to be 19, which is not a valid value of A. Therefore, the ranges of valid values of A and B have to be redefined to reflect this kind of dependency.

If we do not repartition the original equivalence classes of the parameters and select the values randomly from each equivalence class, then in most cases the combination of values will not reflect the required dependency, and in some cases we are not even left with valid values of the parameters that can reflect the dependency. Before going into the actual algorithm, we show by several examples how we can repartition the original equivalence classes of the parameters.

There can be several categories of data dependency. We will mainly concentrate on two of them.

1. Equality dependency
2. Inequality dependency

Equality dependency

Equality dependency is a kind of data dependency in which an algebraic combination of zero or more parameters has to be equal (or not equal) to another algebraic combination of zero or more parameters.

Example 1 Consider the following kind of data dependency:

A = B + 10, where 5 ≤ A ≤ 15 and 0 ≤ B ≤ 10.

In this particular case we have to select a value of A (within its valid range) in such a way that there exists a value for B (within its valid range) that satisfies the equation, and the same holds the other way around (if one selects the value of B first). The minimum value of A derived from the equation is 10 ((minimum value of B) + 10). Similarly, its maximum value is 20. However, the original range of valid values for A is between 5 and 15. If we intersect the two ranges, we get the new range of valid values for the parameter A (10 ≤ A ≤ 15). We follow the same procedure to calculate the new range of valid values for the parameter B. The invariant used in this process is that the left-hand side of the equation (A) should have the same range of valid values as its right-hand side (B + 10) and vice-versa.


The reconstruction of the range of valid values of A results in:

10 ≤ A ≤ 15

The range of valid values of B is redefined as:

0 ≤ B ≤ 5

The new ranges for the parameters satisfy the motivation behind the repartitioning, i.e., if we select any value from the valid partition of one parameter, we are always left with some valid values of the other parameters. In case one of the valid ranges (for any parameter) turns out to be empty, the dependency cannot hold in any scenario. A generalized mechanism to repartition the original ranges of valid values of the parameters is given in Algorithm 1.
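For the simple pattern A = B + k of Example 1, the repartitioning amounts to intersecting each original range with the range implied by the other side of the equation. A minimal Python sketch of this special case (our own helper, not the full Algorithm 1):

def repartition_equality_with_constant(range_a, range_b, k):
    """Repartition ranges for a dependency of the form A = B + k.
    range_a and range_b are (min, max) tuples; the new ranges are returned."""
    (min_a, max_a), (min_b, max_b) = range_a, range_b
    # A must lie in the range of B + k, intersected with its original range.
    new_a = (max(min_a, min_b + k), min(max_a, max_b + k))
    # B must lie in the range of A - k, intersected with its original range.
    new_b = (max(min_b, min_a - k), min(max_b, max_a - k))
    return new_a, new_b

# Example 1: 5 <= A <= 15, 0 <= B <= 10, A = B + 10
print(repartition_equality_with_constant((5, 15), (0, 10), 10))
# ((10, 15), (0, 5)) -- the repartitioned classes given above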

Example 2 Consider the following type of data dependency:

A = B + C

where A and B have the same range of valid values as in Example 1 and C has a range of valid values as:

10 ≤ C ≤ 20

The data dependency requires that the range of valid values of A has to be equal to the range of valid values of B + C, leading to the reconstruction of the equivalence classes of A, B and C as:

10 ≤ A ≤ 15
0 ≤ B ≤ 5
10 ≤ C ≤ 15

If one selects the value of parameter A as 8 (within its original range), then one is not left with any values of the parameters B and C that satisfy the equation A = B + C.

Example 3 Consider the following kind of data dependency:

A + B = C + D

where A, B and C have the same ranges of valid values as in Example 2 and D has the range of valid values:

7 ≤ D ≤ 12


The dependency now requires that A + B should have the same range of valid values as C + D, i.e., between 17 and 32, and vice-versa, leading to the reconstruction of the valid equivalence classes of A, B, C and D as:

7 ≤ A ≤ 15
2 ≤ B ≤ 10
10 ≤ C ≤ 18
7 ≤ D ≤ 12

This roughly covers the types of dependencies that may occur in any application domain. Two major variations can occur. First, other mathematical operators may appear instead of ‘+’, giving rise to a more complex algebraic structure. The second variation concerns the data set of the variables: in the above examples the variables were given as ranges, but other forms of input can be a single number (constant), a set of valid values, or Booleans.

We can treat a constant as a range whose minimum and maximum values are both the constant itself, and a set of valid values as a range whose minimum is the smallest constant in the set and whose maximum is the largest constant in the set. If the set of valid values contains strings, there is no need to apply the algorithm, as no algebraic operations can be performed on strings. For Booleans we do not apply the algorithm either.
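This normalization of the different input forms to a (min, max) range can be sketched as follows (the helper name and the exact handling of non-numeric inputs are our own choices):

def to_range(value):
    """Normalize a parameter's value set to a (min, max) range, or None
    if the algorithm is not applied (strings, Booleans)."""
    if isinstance(value, bool):
        return None                      # Booleans: algorithm not applied
    if isinstance(value, (int, float)):
        return (value, value)            # constant: degenerate range
    if isinstance(value, (set, list)):
        if any(isinstance(v, str) for v in value):
            return None                  # strings: no algebraic operations possible
        return (min(value), max(value))  # set of valid values
    return value                         # already a (min, max) range

print(to_range(7))           # (7, 7)
print(to_range({3, 8, 5}))   # (3, 8)
print(to_range(True))        # None
print(to_range((5, 15)))     # (5, 15)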

Inequality dependency

Inequality dependency is a kind of data dependency in which an algebraic combination of zero or more parameters has to be greater than, less than, greater than or equal to, or less than or equal to another algebraic combination of zero or more parameters.

Example 4 Consider the following form of data dependency:

A ≤ B + 10, where

5 ≤ A ≤ 15
0 ≤ B ≤ 10

In this particular example A has to be less than or equal to B + 10, and B + 10 has to be greater than or equal to A, giving rise to the following ranges of valid values:

5 ≤ A ≤ 15
0 ≤ B ≤ 10

In this case the original ranges already satisfy the dependency, so the intersection with the original ranges does not narrow them.


Example 5 Consider the following form of data dependency:

A ≥ B + C, where

5 ≤ A ≤ 15
0 ≤ B ≤ 10
10 ≤ C ≤ 20

Proceeding according to Algorithm 1 gives rise to the following redefined ranges of valid values:

10 ≤ A ≤ 15
0 ≤ B ≤ 5
10 ≤ C ≤ 15

Example 6 Consider the following form of data dependency:

A + B ≥ C + D

Again proceeding according to Algorithm 1, with the same ranges for A, B, C and D as in Example 3, we obtain the following redefined ranges of valid values:

7 ≤ A ≤ 15
2 ≤ B ≤ 10
10 ≤ C ≤ 18
7 ≤ D ≤ 12

Next we show how we repartition the equivalence classes of Example 3 using Algorithm 1:

Equation E is A + B = C + D.

MinA = 5.

MaxA = 15.

MinB = 0.

MaxB = 10.

MinC = 10.

MaxC = 20.

MinD = 7.

MaxD = 12.


Figure 5.4: Flow diagram of Algorithm 1 (variables with ranges and dependencies as equations → split each equation into LHS and RHS → define ranges for LHS and RHS → assign a new range to every variable → intersect it with the original range)


Algorithm 1 Data Dependent Equivalence Class

Input: Variables with their ranges

Input: Dependencies in the form of equations

1: procedure Data Dependent Equivalence Class

2: for all Equations E do

3: for all Variable V in E do

4: MinV = Minimum value of V in its defined range.

5: MaxV = Maximum value of V in its defined range.

6: end for

7: Split E into LHS and RHS

8: if V preceded by + or * or any increasing sequence then

9: LHSmin = Substitute Minimum value of each variable in LHS

10: LHSmax = Substitute Maximum value of each variable in LHS

11: RHSmin = Substitute Minimum value of each variable in RHS

12: RHSmax = Substitute Maximum value of each variable in RHS

13: else

14: LHSmin = Substitute Maximum value of each variable in LHS

15: LHSmax = Substitute Minimum value of each variable in LHS

16: RHSmin = Substitute Maximum value of each variable in RHS

17: RHSmax = Substitute Minimum value of each variable in RHS

18: end if

▷ LHSmin and LHSmax become the new range of the corresponding RHS and vice-versa

19: LHSmin ≤ RHS ≤ LHSmax

20: RHSmin ≤ LHS ≤ RHSmax

21: for all Variable V in LHS do

22: for all Variable V1 ≠ V do

23: if V1 preceded by + or * or any increasing sequence then

24: CorrectedMinV = Substitute Max value of V1 in RHSmin = LHS

25: CorrectedMaxV = Substitute Min value of V1 in RHSmax = LHS

26: else

27: CorrectedMinV = Substitute Min value of V1 in RHSmin = LHS

28: CorrectedMaxV = Substitute Max value of V1 in RHSmax = LHS

29: end if

30: end for

31: end for

32: for all Variable V in RHS do

33: for all Variable V1 ≠ V do

34: if V1 preceded by + or * or any increasing sequence then

35: CorrectedMinV = Substitute Max value of V1 in LHSmin = RHS

36: CorrectedMaxV = Substitute Min value of V1 in LHSmax = RHS

37: else

38: CorrectedMinV = Substitute Min value of V1 in LHSmin = RHS

39: CorrectedMaxV = Substitute Max value of V1 in LHSmax = RHS

40: end if

41: end for

42: end for ▷ The two ranges of the same variable V are R1 and R2


43: R3 = R1 ∩ R2.

44: if R3 == NULL then

45: V does not have any valid value

46: else

47: R3 is the new range of valid values of V

48: end if

49: end for

50: end procedure

Splitting E gives LHS = A + B and RHS = C + D. Substituting the minimum and maximum values of the variables yields LHSmin = 5, LHSmax = 25, RHSmin = 17 and RHSmax = 32.

Defining the ranges for the LHS and RHS:

RHSmin ≤ LHS ≤ RHSmax, i.e., 17 ≤ A + B ≤ 32
LHSmin ≤ RHS ≤ LHSmax, i.e., 5 ≤ C + D ≤ 25

To get the final range, we intersect the original range and the new range of each variable:

MinA = 7.

MaxA = 15.

MinB = 2.

MaxB = 10.

MinC = 10.

MaxC = 18.

MinD = 7.

MaxD = 12.
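A compact Python sketch of the equality case of Algorithm 1, restricted to sums of variables connected by '+' (so every variable is preceded by an increasing operation), reproduces the walkthrough above. It is a simplified reading of the pseudocode, not a complete implementation:

def repartition_sum_equality(lhs, rhs, ranges):
    """Repartition ranges for a dependency sum(lhs) = sum(rhs).
    lhs and rhs are lists of variable names; ranges maps a name to (min, max)."""
    new_ranges = dict(ranges)
    lhs_min = sum(ranges[v][0] for v in lhs)
    lhs_max = sum(ranges[v][1] for v in lhs)
    rhs_min = sum(ranges[v][0] for v in rhs)
    rhs_max = sum(ranges[v][1] for v in rhs)
    for side, opp_min, opp_max in ((lhs, rhs_min, rhs_max), (rhs, lhs_min, lhs_max)):
        for v in side:
            # Corrected bounds for v: substitute the extreme values of the other
            # variables on the same side and solve against the opposite side's bounds.
            others_min = sum(ranges[u][0] for u in side if u != v)
            others_max = sum(ranges[u][1] for u in side if u != v)
            lo = max(ranges[v][0], opp_min - others_max)
            hi = min(ranges[v][1], opp_max - others_min)
            new_ranges[v] = (lo, hi) if lo <= hi else None  # None: no valid value
    return new_ranges

# Example 3: A + B = C + D
ranges = {"A": (5, 15), "B": (0, 10), "C": (10, 20), "D": (7, 12)}
print(repartition_sum_equality(["A", "B"], ["C", "D"], ranges))
# {'A': (7, 15), 'B': (2, 10), 'C': (10, 18), 'D': (7, 12)}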


However, the pseudocode for the inequality dependency, as illustrated by Examples 4, 5 and 6, is not included as part of Algorithm 1 in order to keep it simple and readable. In the case of such a dependency, lines 19 and 20 of Algorithm 1, which represent the equality dependency, have to be extended. If the dependency equation contains ≤, the following pseudocode represents it:

LHSmin ≤ LHS ≤ RHSmax
LHSmin ≤ RHS ≤ RHSmax

If the dependency equation contains ≥, the following pseudocode represents it:

RHSmin ≤ LHS ≤ LHSmax
RHSmin ≤ RHS ≤ LHSmax
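The ≥ case can be sketched in the same style. Again this is a simplified reading of the extended pseudocode, restricted to sums of variables:

def repartition_sum_geq(lhs, rhs, ranges):
    """Repartition ranges for sum(lhs) >= sum(rhs), applying the extended bounds
    RHSmin <= LHS <= LHSmax and RHSmin <= RHS <= LHSmax."""
    new_ranges = dict(ranges)
    rhs_min = sum(ranges[v][0] for v in rhs)   # RHSmin
    lhs_max = sum(ranges[v][1] for v in lhs)   # LHSmax
    # Both sides are bounded below by RHSmin and above by LHSmax.
    for side in (lhs, rhs):
        for v in side:
            others_min = sum(ranges[u][0] for u in side if u != v)
            others_max = sum(ranges[u][1] for u in side if u != v)
            lo = max(ranges[v][0], rhs_min - others_max)
            hi = min(ranges[v][1], lhs_max - others_min)
            new_ranges[v] = (lo, hi) if lo <= hi else None  # None: no valid value
    return new_ranges

# Example 5: A >= B + C
print(repartition_sum_geq(["A"], ["B", "C"],
                          {"A": (5, 15), "B": (0, 10), "C": (10, 20)}))
# {'A': (10, 15), 'B': (0, 5), 'C': (10, 15)} -- as derived above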

Remark: Algorithm 1 only provides an initial domain of values from which the parameters should be valuated in order to reflect the dependency; the actual assignment of values to the parameters to construct test cases is a separate issue.

The algorithm may be executed several times during the process of assigning actual values to the parameters. We explain the overall working of the process with the help of the dependencies among parameters in the current SUT.

Figure 5.5 shows the geometrical interpretation of Algorithm 1 for Example 1. The dependency equation, i.e.,

A = B + 10

has its solutions on the line segment A = B + 10. The initial ranges of the parameters A and B are shown as brown and blue strips, respectively. The dependency between the parameters can only be reflected if the values of A and B are selected from the region where the brown strip, the blue strip and the line segment A = B + 10 intersect. This is shown as a red segment in Figure 5.5. Any other value
