(Last modified Tue Jun 03 14:17 2008)


In4matx 115
Software Specification and
Quality Engineering
Spring 2008
Selection criteria for testing

Graph-based criteria

Graph coverage of code flowgraphs

Figure 1.  Control flow graph for m(x,y)

Figure 2.  Control and data flow graph for m(x,y)

There are many graph-based testing criteria:  some are listed below.  The criteria are easiest to understand in terms of programs and these two graphs: 

control flow graph
A graph whose nodes are the statements and whose edges show how the program can move from statement to statement. 
data flow graph
A graph whose nodes are the defs (a statement that gives a variable a value, i.e. a "definition") and uses (a statement that uses the value of a variable), and whose edges connect each def to every use that could be affected by that def. 

Uses include appearances in the RHS of an assignment (including "invisible" ones like x += 1 which is both a def and a use of x because it means x = x+1), as actual parameters of calls, as return values from a block, and in a branch or loop condition. 

Every control-flow path from a def of a variable to a use of that variable, with no other def of that variable in between, is an edge in the data flow graph. 

However, the criteria can be applied to any graph describing any system artifact, for example specifications, designs, and requirements (see below). 

Here is an example program (that does nothing in particular) whose control flow and data flow graphs are shown in the figures. 

 1  int m(int x, int y) {
 2    while (x > 10) {
 3      x -= 10;
 4      if (x == 10) {
 5        break;
 6      }
 7    }
 8    x = square(x);
 9    if (y < 20 && x%2 == 0) {
10      y += 20;
11    }
12    else {
13      y -= 20;
14    }
15    return 2*x + y;
16  }

We list paths through the program using only the statement numbers 1, 2, 3, 4, 5, 8, 9, 10, 13, and 15;  we don't list 6, 7, 11, 12, 14, or 16 because those lines hold only braces or else, and so contain no def or use. 

We list the line with the loop condition every time the condition is checked: 

Test set for All-statements coverage: 
  (a) x=20, y=10:  1,2,3,4,5,8,9,10,15. 
  (b) x=20, y=30:  1,2,3,4,5,8,9,13,15. 

Test set for All-edges coverage: 
  (a) x=20, y=10:  1,2,3,4,5,8,9,10,15. 
  (b) x=15, y=30:  1,2,3,4,2,8,9,13,15. 

(Infinite) test set for All-paths coverage: 

  (a) x=4,  y=10:  1,2,8,9,10,15. 
  (b) x=5,  y=10:  1,2,8,9,13,15. 
  (c) x=14, y=10:  1,2,3,4,2,8,9,10,15. 
  (d) x=15, y=10:  1,2,3,4,2,8,9,13,15. 
  (e) x=20, y=10:  1,2,3,4,5,8,9,10,15. 
  (f) x=20, y=20:  1,2,3,4,5,8,9,13,15. 
...
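
The paths listed above can be checked mechanically.  The following is a minimal sketch, not part of the original handout:  it instruments m(x,y) by hand so that it records the listed line numbers as they execute.  The class name TracedM and the body of square() are assumptions (the handout does not define square()), and the while loop is restructured only so that line 2 can be recorded each time its condition is checked. 

```java
import java.util.ArrayList;
import java.util.List;

final class TracedM {
    // Hypothetical stand-in for square(), which the handout does not define.
    static int square(int x) { return x * x; }

    // Returns the listed line numbers of m(x,y) in the order they execute.
    static List<Integer> trace(int x, int y) {
        List<Integer> path = new ArrayList<>();
        path.add(1);                          // line 1: parameters are bound
        while (true) {
            path.add(2);                      // line 2: loop condition is checked
            if (!(x > 10)) break;
            path.add(3);  x -= 10;            // line 3
            path.add(4);                      // line 4: if condition is checked
            if (x == 10) { path.add(5); break; }
        }
        path.add(8);  x = square(x);          // line 8
        path.add(9);                          // line 9: if condition is checked
        if (y < 20 && x % 2 == 0) { path.add(10); y += 20; }
        else                      { path.add(13); y -= 20; }
        path.add(15);                         // line 15: return
        return path;
    }

    public static void main(String[] args) {
        System.out.println(trace(20, 10));    // [1, 2, 3, 4, 5, 8, 9, 10, 15]
        System.out.println(trace(15, 30));    // [1, 2, 3, 4, 2, 8, 9, 13, 15]
    }
}
```

Calling trace() with the inputs of each test case reproduces the path given for it. 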

Figure 3.  Subsumption among graph coverage criteria

The most commonly encountered criteria are: 

All-Statements (= All-Nodes)
Every node in the (control-flow) graph is executed by at least one test case. 
All-Edges
Every edge in the (control-flow) graph is traversed by at least one test case. 
All-Paths
Every path through the (control-flow) graph is traversed by at least one test case. 
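
As a rough illustration of how a tool might measure the first two criteria, the sketch below reports which nodes and edges of the control flow graph of m(x,y) are missed by a set of recorded paths.  The edge list is transcribed by hand from the program listing, and the class and method names are hypothetical. 

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class GraphCoverage {
    // Edges of the control flow graph of m(x,y), transcribed by hand from
    // the listing (an assumption; Figure 1 itself is not reproduced here).
    static final int[][] EDGES = {
        {1, 2}, {2, 3}, {2, 8}, {3, 4}, {4, 5}, {4, 2},
        {5, 8}, {8, 9}, {9, 10}, {9, 13}, {10, 15}, {13, 15}
    };

    // Edges that no path in the test set traverses.
    static List<String> uncoveredEdges(int[][] paths) {
        Set<String> seen = new HashSet<>();
        for (int[] p : paths)
            for (int i = 0; i + 1 < p.length; i++)
                seen.add(p[i] + "->" + p[i + 1]);
        List<String> missing = new ArrayList<>();
        for (int[] e : EDGES) {
            String edge = e[0] + "->" + e[1];
            if (!seen.contains(edge)) missing.add(edge);
        }
        return missing;
    }

    // Nodes that no path in the test set visits.
    static List<Integer> uncoveredNodes(int[][] paths) {
        Set<Integer> seen = new HashSet<>();
        for (int[] p : paths)
            for (int n : p) seen.add(n);
        List<Integer> missing = new ArrayList<>();
        for (int[] e : EDGES)
            for (int n : e)
                if (!seen.contains(n) && !missing.contains(n)) missing.add(n);
        return missing;
    }

    public static void main(String[] args) {
        int[][] allStatements = { {1,2,3,4,5,8,9,10,15}, {1,2,3,4,5,8,9,13,15} };
        int[][] allEdges      = { {1,2,3,4,5,8,9,10,15}, {1,2,3,4,2,8,9,13,15} };
        System.out.println(uncoveredNodes(allStatements)); // []
        System.out.println(uncoveredEdges(allStatements)); // [2->8, 4->2]
        System.out.println(uncoveredEdges(allEdges));      // []
    }
}
```

The all-statements test set given earlier visits every node but misses the edges 2->8 and 4->2, which is why all-statements does not imply all-edges. 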

All those are control flow criteria.  The most commonly encountered data flow criteria are: 

All-Defs
A path from each used definition in the data-flow graph to one use of that definition is executed by at least one test case. 
All-Uses
A path from each used definition in the data-flow graph to each use of that definition is executed by at least one test case. 
All-du-Paths
Every path from each used definition in the data-flow graph to each use of that definition is executed by at least one test case. 
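
To make the data flow criteria concrete, here is a sketch that computes which def-use pairs a single executed path exercises in m(x,y).  The def and use tables are transcribed by hand from the listing, and the names are hypothetical.  It makes one simplifying assumption:  line 9 is treated as always using x, which holds only when y < 20 so that x%2 == 0 is actually evaluated. 

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class DuPairs {
    // Variables used and defined on each line of m(x,y), transcribed by hand.
    static final Map<Integer, String[]> USES = new HashMap<>();
    static final Map<Integer, String[]> DEFS = new HashMap<>();
    static {
        DEFS.put(1, new String[] {"x", "y"});          // parameter binding
        USES.put(2, new String[] {"x"});
        USES.put(3, new String[] {"x"});   DEFS.put(3, new String[] {"x"});
        USES.put(4, new String[] {"x"});
        USES.put(8, new String[] {"x"});   DEFS.put(8, new String[] {"x"});
        USES.put(9, new String[] {"y", "x"});          // assumes y < 20 (see lead-in)
        USES.put(10, new String[] {"y"});  DEFS.put(10, new String[] {"y"});
        USES.put(13, new String[] {"y"});  DEFS.put(13, new String[] {"y"});
        USES.put(15, new String[] {"x", "y"});
    }

    // Def-use pairs "var:defLine->useLine" exercised by one executed path.
    static Set<String> exercised(int... path) {
        Map<String, Integer> lastDef = new HashMap<>();
        Set<String> pairs = new TreeSet<>();
        for (int line : path) {
            // Within a statement such as x -= 10, the use precedes the def.
            for (String v : USES.getOrDefault(line, new String[0]))
                if (lastDef.containsKey(v))
                    pairs.add(v + ":" + lastDef.get(v) + "->" + line);
            for (String v : DEFS.getOrDefault(line, new String[0]))
                lastDef.put(v, line);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Path of test case x=20, y=10.
        System.out.println(exercised(1, 2, 3, 4, 5, 8, 9, 10, 15));
    }
}
```

A test set satisfies All-Uses when the union of these sets over all its paths contains every def-use pair of the data flow graph. 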

Researchers have proved that the various criteria are partially ordered, from the strongest (all-paths) which subsumes all the others, through criteria of intermediate strength (such as all-edges which subsumes all-branches and all-statements), to the weakest (all-statements).  For example, any test set that satisfies all-branches also will satisfy all-statements; any test set satisfying all-edges also satisfies all-branches (and thus all-statements). 

There are programs for which one or more of these criteria cannot be strictly satisfied.  For example, any program with unreachable code can't be covered using any of the criteria above in the strictest sense.  Many programs with loops can't be strictly all-paths covered by any finite test set.  More subtly, some programs have dependencies between the conditions for two or more if statements or loops, so that although it looks like one could take the program through all paths through the branches, the dependencies make it impossible to execute some paths, some edges in the graph cannot be taken, and even some statements are unreachable.  Here is a simple example for which statements 4 and 7 are unreachable: 

1   void bad(boolean b) {
2     if (b) {
3       if (!b) {
4         b = !b;  // unreachable
5       }
6       while (!b) {
7         b = !b;  // also unreachable
8       }
9     }
10  }

Consequently, the criteria are usually used to mean only the reachable items (paths, edges, nodes). 
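
Because bad() has only two possible inputs, the claim can be checked exhaustively.  The sketch below is an illustration under assumed names (UnreachableDemo, executed, runAllInputs):  it marks the two statements in question and runs both inputs. 

```java
import java.util.Set;
import java.util.TreeSet;

final class UnreachableDemo {
    // Numbers of the marked statements that have executed so far.
    static final Set<Integer> executed = new TreeSet<>();

    static void bad(boolean b) {
        if (b) {
            if (!b) {
                executed.add(4);   // statement 4 of the handout's listing
                b = !b;
            }
            while (!b) {
                executed.add(7);   // statement 7 of the handout's listing
                b = !b;
            }
        }
    }

    // Runs bad() on every possible input and returns the marked statements reached.
    static Set<Integer> runAllInputs() {
        executed.clear();
        bad(true);
        bad(false);
        return executed;
    }

    public static void main(String[] args) {
        System.out.println(runAllInputs());   // [] : neither statement is ever reached
    }
}
```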

Although it is delightful to have these criteria and understand how they are related, they are challenging to work with manually for systems of realistic size, and some kinds of automated support are problematic.  For example, deriving a set of test cases that achieve all-paths coverage of an arbitrary program is equivalent to solving the halting problem for that program.  However, it is much easier to verify coverage than to derive test cases to achieve coverage, and there are software tools to calculate or monitor the coverage of specific test sets. 

Graph coverage at the design element level


Figure 4.  Call graph

Figure 4 shows a call graph for classes or methods A through F (the use of the graph is the same for either case).  For classes, this graph is also the USES graph.  We can cover such graphs using control-flow criteria like we did for program control-flow graphs. 

The call graph of a procedural program is connected (unless there are functions that are never called).  In contrast, the call graph of a single OO class is typically disconnected, since many classes' methods are designed to be used by other classes rather than the class containing them.  OO call graph coverage is more useful if all the classes of a program are considered, rather than only a single class. 


Figure 5.  Covering an OO call graph

Inheritance and polymorphism complicate call and USES graph coverage.  Figure 5 shows a simple call graph in which a method in class A calls two methods in class C;  in A, objects of type C may be either objects of class C, of subclass C1, or of subclass C2.  The full call graph involves calls to all the m1() and m2() methods:  the call in A to m1() may be to C's, C1's, or C2's m1(), and the call in A to m2() may be to C's or C2's m2() (C1 does not redefine m2()). 
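
The situation can be sketched in Java as follows.  The class and method names C, C1, C2, A, m1() and m2() come from the description above;  everything else (the method bodies, useC(), and the call log) is assumed for illustration. 

```java
import java.util.ArrayList;
import java.util.List;

final class DispatchDemo {
    // Log of the methods actually invoked, in order.
    static final List<String> calls = new ArrayList<>();

    static class C {
        void m1() { calls.add("C.m1"); }
        void m2() { calls.add("C.m2"); }
    }
    static class C1 extends C {                 // redefines m1() only
        @Override void m1() { calls.add("C1.m1"); }
    }
    static class C2 extends C {                 // redefines both methods
        @Override void m1() { calls.add("C2.m1"); }
        @Override void m2() { calls.add("C2.m2"); }
    }
    static class A {
        // The two call sites in A; the target depends on the runtime class of c.
        void useC(C c) { c.m1(); c.m2(); }
    }

    static List<String> run(C... receivers) {
        calls.clear();
        A a = new A();
        for (C c : receivers) a.useC(c);
        return new ArrayList<>(calls);
    }

    public static void main(String[] args) {
        // Prints [C.m1, C.m2, C1.m1, C.m2, C2.m1, C2.m2]
        System.out.println(run(new C(), new C1(), new C2()));
    }
}
```

Covering all five call targets takes receivers of all three classes;  a test using only a C object covers two of them. 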

We can also examine call and USES graphs combining two or more modules or classes using data-flow criteria and last defs and first uses.  Here we examine the code preceding a call to identify the last defs of each of the call's actual parameters, and the code of the method or function to identify the first uses of each formal parameter.  A def of x is a last def before method M if there is a def-clear(x) path from the def to the call of M using x.  Similarly, a use of x is a first use in (after) method M if there is a def-clear(x) and use-clear(x) path from the beginning of M (from the call of M) to the use.  The binding of the actual parameter to the formal parameter is not counted as a first use. 

 1  int m(int x, int y) {   //  A last def of x before line 8
 2    while (x > 10) {      //  First use of x in m(x,y)
 3      x -= 10;            //  The other last def of x before line 8
 4      if (x == 10) {
 5        break;
 6      }
 7    }
 8    x = square(x);
 9    if (y < 20 && x%2 == 0) {  //  A first use of x after line 8
10      y += 20;
11    }
12    else {
13      y -= 20;
14    }
15    return 2*x + y;            //  The other first use of x after line 8
16  }

In the example program m(x,y) above (presented a second time here), line 1 of that program is not a first use of x because that is the binding of the actual parameter to x.  The first use of x is on line 2, in the while condition.  m(x,y) has only one first use of x because all later uses of x are on paths that include line 2, and thus are not use-clear(x). 

The last defs of x before the call square(x) in line 8 are lines 1 and 3.  There are two last defs of x before the call to square(x) because there is a def-clear(x) path from each of them to the call. 

The first uses of x after it receives square(x)'s return value are on lines 9 and 15.  There are two first uses because && is defined to be minimal (short-circuit:  the second operand is not evaluated if the first operand is false) in Java, C, and many other languages:  x%2 == 0 in line 9 is only evaluated if y < 20 is true.  If y < 20 is false then y < 20 && x%2 == 0 is false regardless of x%2 == 0, in all programming languages with which I am familiar. 
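
The minimal evaluation of && is easy to observe.  In this sketch the parity test of line 9 is moved into a helper that counts how often it runs;  the helper isEven(), the counter, and the class name are assumptions made for the demonstration. 

```java
final class ShortCircuitDemo {
    // Number of times the second operand of && has been evaluated.
    static int parityChecks = 0;

    static boolean isEven(int x) {
        parityChecks++;
        return x % 2 == 0;
    }

    // Same condition as line 9 of m(x,y), with the use of x made observable.
    static boolean line9(int x, int y) {
        return y < 20 && isEven(x);
    }

    public static void main(String[] args) {
        line9(4, 30);                         // y < 20 is false: x is not used
        System.out.println(parityChecks);     // 0
        line9(4, 10);                         // y < 20 is true: x is used
        System.out.println(parityChecks);     // 1
    }
}
```

When y < 20 is false, line 9 contains no use of x, so the first use of x after line 8 on that path is the one on line 15. 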

Covering logic formulae


Figure 6.  Subsumption among logic coverage criteria

Here we will look at logic formulae as they can be constructed in a programming language statement.  The language is a predicate logic, i.e. first-order logic without the quantifiers ∀ and ∃, in which a formula in the logic is termed a predicate and what would be a predicate in FOL is termed a function.  A predicate can be: 

There are coverage criteria for programs based on the predicates in the programs. 

Predicate coverage
For every predicate P in the program, there is a test case that makes P false, and a test case that makes P true. 
Clause coverage
For every clause C in a predicate in the program, there is a test case that makes C false, and a test case that makes C true. 
Combinatorial coverage
For every predicate P in the program, there is a test case that produces every combination of true and false for the clauses in P. 

The next level is to consider which clauses can control the value of the predicate and under what circumstances.  Let C be a clause in predicate P.  If there is a combination of true and false values for all the other clauses in P, such that changing C's truth value results in changing P's truth value, then C is a major clause of P, the other clauses are the corresponding minor clauses, and for those combinations of truth values for the minor clauses, C determines P. 

Active clause coverage (ACC)
For every major clause C of every predicate P, there are two test cases that give the minor clauses a combination of values that lets C determine P:  one case for which C is true, and another for which C is false. 

Active clause coverage is closely related to the MC/DC (modified condition/decision coverage) criterion, but is defined without ambiguity.  There are three "flavors" of active clause coverage, depending on what is required to be the same for the two cases: 

General active clause coverage
ACC, allowing the minor clauses to have different values for the two test cases and allowing P to have different values or the same value for the two test cases. 
Correlated active clause coverage
ACC, allowing the minor clauses to have different values for the two test cases, and requiring P to be true for one test case and false for the other. 
Restricted active clause coverage
ACC, with each minor clause having the same value for the two test cases, and requiring P to be true for one test case and false for the other. 
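
Whether a clause determines a predicate for given minor clause values can be checked directly from the definition:  evaluate the predicate with the major clause true and again with it false, and compare.  The sketch below does this for a predicate represented as a function over an array of clause values;  that representation and all the names are assumptions made for illustration. 

```java
import java.util.function.Predicate;

final class ActiveClause {
    // The predicate of line 9 of m(x,y): clause 0 is y < 20, clause 1 is x%2 == 0.
    static final Predicate<boolean[]> LINE9 = c -> c[0] && c[1];

    // True if clause `major` determines p when the other (minor) clauses
    // have the values given in `values`; values[major] itself is ignored.
    static boolean determines(Predicate<boolean[]> p, boolean[] values, int major) {
        boolean[] withTrue = values.clone();
        boolean[] withFalse = values.clone();
        withTrue[major] = true;
        withFalse[major] = false;
        return p.test(withTrue) != p.test(withFalse);
    }

    public static void main(String[] args) {
        // y < 20 determines the predicate only when x%2 == 0 is true.
        System.out.println(determines(LINE9, new boolean[] {true, true}, 0));   // true
        System.out.println(determines(LINE9, new boolean[] {true, false}, 0));  // false
    }
}
```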

In the example program m(x,y), the examples below consider two predicates, x > 10 and y < 20 && x%2 == 0, and three clauses, x > 10, y < 20, and x%2 == 0. 

A test set providing predicate coverage (but not clause coverage): 

  x   y   x>10  y<20&&x%2==0
  8   20  F     F
  12  19  T     T

A test set providing clause coverage (but not combinatorial coverage): 

  x   y   x>10  y<20  x%2==0
  9   20  F     F     F
  12  19  T     T     T
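
The two small test sets above can be checked against the definitions of predicate coverage and clause coverage with a few lines of code.  This is a sketch under stated assumptions:  the clause values are computed from the input values of x and y, as the tables do, and the class and method names are hypothetical. 

```java
final class LogicCoverage {
    // Clause values for one test input: { x > 10, y < 20, x%2 == 0 }.
    static boolean[] clauses(int x, int y) {
        return new boolean[] { x > 10, y < 20, x % 2 == 0 };
    }

    // Clause coverage: every clause is true for some test and false for some test.
    static boolean clauseCoverage(int[][] tests) {
        boolean[] sawTrue = new boolean[3], sawFalse = new boolean[3];
        for (int[] t : tests) {
            boolean[] c = clauses(t[0], t[1]);
            for (int i = 0; i < 3; i++) {
                if (c[i]) sawTrue[i] = true; else sawFalse[i] = true;
            }
        }
        for (int i = 0; i < 3; i++)
            if (!sawTrue[i] || !sawFalse[i]) return false;
        return true;
    }

    // Predicate coverage: x > 10 and y < 20 && x%2 == 0 each take both values.
    static boolean predicateCoverage(int[][] tests) {
        boolean p1True = false, p1False = false, p2True = false, p2False = false;
        for (int[] t : tests) {
            boolean[] c = clauses(t[0], t[1]);
            if (c[0]) p1True = true; else p1False = true;
            if (c[1] && c[2]) p2True = true; else p2False = true;
        }
        return p1True && p1False && p2True && p2False;
    }

    public static void main(String[] args) {
        int[][] first  = { {8, 20}, {12, 19} };
        int[][] second = { {9, 20}, {12, 19} };
        System.out.println(predicateCoverage(first) + " " + clauseCoverage(first));   // true false
        System.out.println(predicateCoverage(second) + " " + clauseCoverage(second)); // true true
    }
}
```

The first set misses clause coverage because x%2 == 0 is true for both x=8 and x=12. 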

A test set providing general active clause coverage and correlated active clause coverage (I don't believe this program has a test set providing only general active clause coverage): 

  x   y   x>10  y<20  x%2==0  y<20&&x%2==0
  9   20  F     F     F       F             x>10 major
  12  20  T     F     T       F             y<20 major
  11  19  T     T     F       F             x%2==0 major
  12  19  T     T     T       T             x>10 major, y<20 major, and x%2==0 major

A test set providing restricted active clause coverage: 

  x   y   x>10  y<20  x%2==0  y<20&&x%2==0
  8   19  F     T     T       T             x>10 major
  11  19  T     T     F       F             x%2==0 major
  12  20  T     F     T       F             y<20 major
  12  19  T     T     T       T             x>10 major, y<20 major, and x%2==0 major

The test set providing combinatorial coverage: 

  x   y   x>10  y<20  x%2==0
  9   20  F     F     F
  8   20  F     F     T
  9   19  F     T     F
  8   19  F     T     T
  11  20  T     F     F
  12  20  T     F     T
  11  19  T     T     F
  12  19  T     T     T

Test set providing coverage: 

  x   y   x>10  y<20  x%2==0  y<20&&x%2==0
  9   20  F     F     F       F
  8   20  F     F     T       F
  9   19  F     T     F       F
  8   19  F     T     T       T
  11  20  T     F     F       F
  12  20  T     F     T       F
  11  19  T     T     F       F
  12  19  T     T     T       T

Covering partitions of the input space

In this approach, we partition the system's input space into blocks, such that all the inputs in each block are "equally useful" for testing.  Each block is an equivalence class in the partition.  It is necessary that the blocks be disjoint and that together they cover the whole input space, that is, that every test case falls into a block, and no test case falls into two blocks.  Then the test cases cover the input space if there is at least one test case for each block. 

Input space partitioning is usually done from the system's specification rather than its code. 

Let's consider the substring(int begin, int end) method of java.lang.String, which returns the empty string when begin equals end and fails when begin is negative, end exceeds length(), or begin exceeds end: 

  Block                      Characterization
  Returns the empty string   0≤begin AND begin==end AND end≤length()
  Returns a nonempty string  0≤begin<end≤length()
  Fails (1)                  end<begin
  Fails (2)                  begin<0 AND begin≤end
  Fails (3)                  length()<end AND 0≤begin≤end

Then a testing criterion would be to have a test case for each of the five blocks. 

It would be easy to create a non-partition of the input space, for example

  Block                      Characterization
  Returns the empty string   0≤begin AND begin==end AND end≤length()
  Returns a nonempty string  0≤begin<end≤length()
  Fails (1)                  end<begin
  Fails (2)                  begin<0
  Fails (3)                  length()<end

because some test cases fall into more than one block (for example, the case where end<begin<0, which falls into both Fails(1) and Fails(2)). 
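
One way to gain confidence that a set of blocks really is a partition, and that it matches the method's behavior, is to write the blocks as a classifier and compare it with the real method over a small range of inputs.  The sketch below assumes block boundaries that follow the documented behavior of String.substring (begin == end gives the empty string;  begin < 0, end > length(), or begin > end throws an exception);  the enum and method names are hypothetical. 

```java
final class SubstringPartition {
    enum Block { EMPTY, NONEMPTY, FAILS_1, FAILS_2, FAILS_3 }

    // Assigns every (begin, end) pair to exactly one block: the checks are
    // tried in order, so no input can land in two blocks or in none.
    static Block classify(String s, int begin, int end) {
        if (end < begin)      return Block.FAILS_1;
        if (begin < 0)        return Block.FAILS_2;   // here begin <= end
        if (s.length() < end) return Block.FAILS_3;   // here 0 <= begin <= end
        return begin == end ? Block.EMPTY : Block.NONEMPTY;
    }

    // True if the block predicted by classify() matches what substring() does.
    static boolean agrees(String s, int begin, int end) {
        Block predicted = classify(s, begin, end);
        try {
            boolean empty = s.substring(begin, end).isEmpty();
            return predicted == (empty ? Block.EMPTY : Block.NONEMPTY);
        } catch (IndexOutOfBoundsException e) {
            return predicted != Block.EMPTY && predicted != Block.NONEMPTY;
        }
    }

    public static void main(String[] args) {
        boolean allAgree = true;
        for (int begin = -2; begin <= 5; begin++)
            for (int end = -2; end <= 5; end++)
                allAgree &= agrees("abc", begin, end);
        System.out.println(allAgree);   // true
    }
}
```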

Covering syntax

In testing software whose behavior is specified by a grammar, there are coverage metrics based on the grammar that can be used.  Note that a regular expression corresponds to a grammar, so that these criteria work for regular expressions as well. 

Terminal symbol coverage
Every terminal symbol in the grammar is covered by at least one test case. 
Production coverage
Every production in the grammar is exercised by at least one test case. 
Derivation coverage
Every string that can be derived (produced) with the grammar is exercised by at least one test case.  Since a grammar can (in general) derive an infinite number of strings, this criterion is impractical to use directly. 

There are correspondences between grammars and graphs (FSP is one example):  the obvious general one represents strings of symbols (terminal and nonterminal) by states and productions by transitions between states.  Therefore these syntax coverage criteria correspond to graph coverage criteria:  terminal symbol coverage is analogous to all-nodes, production coverage to all-edges, and derivation coverage to all-paths. 
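
As a small illustration of production coverage, consider a hypothetical two-production grammar, S -> a S and S -> b, whose language is zero or more a's followed by one b.  The grammar, the class name, and the hand-written recognizer below are assumptions made for this example;  the recognizer reports which productions a derivation of the input uses. 

```java
import java.util.Set;
import java.util.TreeSet;

final class ProductionCoverage {
    // Productions of the hypothetical grammar:  S -> a S   and   S -> b.
    // Returns the productions used to derive `input`, or null if the
    // input is not in the language.
    static Set<String> productionsUsed(String input) {
        Set<String> used = new TreeSet<>();
        int i = 0;
        while (i < input.length() && input.charAt(i) == 'a') {
            used.add("S -> a S");     // one application per leading 'a'
            i++;
        }
        if (i == input.length() - 1 && input.charAt(i) == 'b') {
            used.add("S -> b");       // the derivation ends with S -> b
            return used;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(productionsUsed("b"));    // [S -> b]
        System.out.println(productionsUsed("ab"));   // [S -> a S, S -> b]
        System.out.println(productionsUsed("ba"));   // null
    }
}
```

For this grammar the single test input "ab" achieves both terminal symbol coverage and production coverage, while derivation coverage would need every string b, ab, aab, and so on. 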

Sources

Adrion+Branstad+Cherniavsky1982-vvtc
W. Richards Adrion, Martha A. Branstad, and John C. Cherniavsky.  Validation, Verification, and Testing of Computer Software.  ACM Comput. Surv., 14(2):159-192, 1982. 

http://dx.doi.org/10.1145/356876.356879

Amman+Offutt2008-ist
Paul Ammann and Jeff Offutt.  Introduction to Software Testing.  Cambridge University Press, 2008. 
Goodenough+Gerhart1975-tttd-tse
John B. Goodenough and Susan L. Gerhart.  Toward a Theory of Test Data Selection.  IEEE Transactions on Software Engineering, 1(2):156-173, June, 1975. 
Muccini 2002 slides
This handout began from Dr. Henry Muccini's slides for ICS122, 2002 (used with permission). 
Thomas A. Alspaugh
Assistant Professor, Informatics Dept.
School of Information and Computer Sciences