Flow Diagram Decomposition Using Graph Transformations


Arend Rensink and Maria Zimakova

August 20, 2009

Contents

Abstract

Chapter 1. Introduction

Chapter 2. Basic Notions

Chapter 3. Flow Diagram Decomposition

3.1. Base subdiagram decomposition

3.2. SCC Decomposition

3.3. Complexity measuring

Chapter 4. Graph Transformations: Implementation within GROOVE

4.1. Graph transformations

4.2. Implementation within GROOVE

Chapter 5. Conclusions and Related Work

Bibliography

Appendix A. GROOVE Rule Description

Abstract

The key challenge of model transformations in model-driven development is in transforming higher-level abstract models into more concrete ones that can be used to generate implementation level models, including executable business process representations and program code. Many modelling languages (such as UML Activity Diagrams or BPMN) use unstructured flow graphs to describe the operation sequence of a business process. If a structured language is chosen as the executable representation, it is difficult to compile the unstructured flows into structured statements. Even if the target language contains goto-like statements, it is often simpler and more efficient to deal with programs that have structured control flow, which also makes the executable representation more understandable.

In this paper, we take a first step towards an implementation of existing decomposition methods using graph transformations, and we evaluate their effectiveness with a view to readability and essential complexity measures.

Chapter 1.

Introduction

Over the last few years, a new option has evolved to define solutions in software industry: Model-Driven Development (MDD). The key challenge of model transformations in MDD is in transforming higher-level abstract models into more concrete ones that can be used to generate implementation level models, including executable business process representations and program code. With this trend, the decomposition of the models into structured elements is of increasing importance.

In general, a number of motivations can be given to justify this work.

Suppose the dynamic behaviour of a business process is described as an unstructured flow graph (which can, in turn, represent a UML activity diagram or a BPMN diagram). If a structured language is chosen as the target executable representation, it is difficult to transform the unstructured flows into structured statements. This problem is analyzed, for instance, in [EKR08], and it also arises in attempts to compile UMLA to BPEL programs; the latter issue is discussed, for instance, in [ZHB05]. The main task of our graph transformations is to translate the unstructured goto-like statements into well-structured statements in the target language.

The second very important reason for the presented work is to improve software reliability and readability – making programs less error prone and easier to understand. Because understanding of behavior is an essential prerequisite to effective program development and modification, programmers are forced to devote substantial time to this task [CV06].

There exist today a number of variants on the idea of well-structured models. Many restructuring methods have been developed in the context of flow diagram decomposition. It is commonly agreed that a natural interpretation of flow diagrams is in terms of graphs: essentially, just nodes with connecting edges. Consequently, the most natural implementation of flow diagram decomposition methods is by graph transformations.

The aim of this work is to bridge the gap between the formalism of the existing flow diagram decomposition methods and a practical implementation in terms of graph transformations, so that these methods can be used in modern programming environments, including executable business process languages.

The remainder of this paper is structured as follows: after providing the basic definitions to set the stage in Chapter 2, we discuss the flow graph decompositions and complexity measure problem in Chapter 3. We consider these to be the heart of our contribution. In Chapter 4 we implement those methods with graph transformations, employing the graph-transformation tool Groove [Ren04] for rule execution. Finally, in the conclusion (Chapter 5) we come back to the above considerations, evaluate our results and discuss plans for future work.

Chapter 2.

Basic Notions

Graphs and flow graphs. One of the core concepts of this paper is that of graphs. We assume a countable universe Λ of labels. We start by repeating the usual definition of a graph.

Definition 1 (labeled directed graph) A labeled directed graph is a tuple G = (N, E, λ) where

• N is a finite nonempty set called a set of nodes;

• E ⊆ N × Λ × N is a set of edges;

• λ is a labeling function λ: N ∪ E → Λ.

Each edge in a directed graph is a triple (v, a, w). We say the edge leaves v and enters w, v is a predecessor of w, and w is a successor of v, v is a source node and w is a target node of the edge. Given e = (v, a, w) ∈ E, we denote src(e) = v, tgt(e) = w and a = λ(e) for its source, target and label, respectively. An edge (v, a, v) is a loop. A labeled directed graph G1 = (N1, E1, λ1) is a subgraph of a graph G2 = (N2, E2, λ2) if N1 ⊆ N2, E1 ⊆ E2, Λ1 ⊆ Λ2 and λ1 ≤ λ2 in the following sense:

$$
\forall x \in N_2 \cup E_2:\quad \lambda_2(x) =
\begin{cases}
\lambda_1(x) \in \Lambda_1, & x \in N_1 \cup E_1; \\
l \in \Lambda_2, & \text{otherwise.}
\end{cases}
$$

Definition 2 (path) A path in a graph G is an alternating sequence of nodes and edges represented as {v1, e1, v2, e2, …} beginning and ending with nodes such that for each i ≥ 1 we have vi ∈ N, ei ∈ E, src(ei) = vi and tgt(ei) = vi+1.

There is a path of no edges from any node to itself. A node w is reachable from a node v if there is a path from v to w. The path length |p| of (v1, …, vk) is k – 1. The concatenation of two paths p, q is denoted by pq, where we require p to be finite and end at the initial node of q [Har89].

Let G be a labeled directed graph as above with a labelling function λ: N ∪ E → Λ. Then a path p = {v1, e1, v2, e2, …, vk−1, ek−1, vk} in G can be represented by a word over the alphabet Λ as follows:

λ(p) = λ(v1)λ(e1)λ(v2) ... λ(vk−1)λ(ek−1)λ(vk).

We call this the word representation of p.

Now suppose that, for a given directed graph G, there is an equivalent undirected graph G′ such that NG = NG′ and EG′ is obtained from EG by replacing all directed edges in G with undirected edges. In the undirected graph G′, two nodes are called connected if there is a path between them. An undirected graph G′ is called connected if every pair of distinct nodes in G′ can be connected through some path. A connected component of an undirected graph G′ is a maximal connected subgraph.

Definition 3 (connected graph) A directed graph G is called weakly connected, or just connected, if replacing all of its directed edges with undirected edges produces a connected undirected graph G′.

Figure 1: Flow graph example

An undirected tree is a connected graph with no cycles. A directed tree is a directed graph which would be a tree if the directions on the edges were ignored. A tree is called a rooted tree if one node has been designated the root, in which case the edges have a natural orientation, towards or away from the root.

A directed graph is called strongly connected if there is a path from each node in the graph to every other node.

Definition 4 (strongly connected component) The strongly connected components (SCC) of a directed graph are its maximal strongly connected subgraphs.

Now let us define the flow graph as follows.

Definition 5 (flow graph) A flow graph Φ is a triple (G, s, t), where

• G = (N, E, λ) is a (weakly) connected labeled directed graph;

• Node s ∈ N is the unique start node such that there are no incoming edges to s in G;

• Node t ∈ N is the unique terminal node such that there are no outgoing edges from t in G.

A flow graph is empty, denoted ε, if N = {s, t} and E = {e} where e = (s, l, t), l ∈ Λ is a distinguished label of e. A flow graph is an elementary a-labelled flow graph if N = {s, v, t} and E = {e1, e2} where e1 = (s, l, v), e2 = (v, l, t) and λ(v) = a.
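The conditions of Definition 5 can be checked mechanically. The following is a minimal sketch, not the authors' implementation; the function name `is_flow_graph` and the edge encoding as `(source, label, target)` triples are assumptions made for illustration, and "unique" is read here as "the only node without incoming (respectively outgoing) edges".

```python
from collections import defaultdict

Edge = tuple[str, str, str]  # (source, label, target), as in Definition 1


def is_flow_graph(nodes: set[str], edges: set[Edge], s: str, t: str) -> bool:
    """Check the conditions of Definition 5 for the triple (G, s, t)."""
    has_incoming = {tgt for _, _, tgt in edges}
    has_outgoing = {src for src, _, _ in edges}
    # Assumption: s is the only node without incoming edges and
    # t is the only node without outgoing edges.
    if nodes - has_incoming != {s} or nodes - has_outgoing != {t}:
        return False
    # Weak connectivity: forget edge directions and search from s.
    neighbours = defaultdict(set)
    for src, _, tgt in edges:
        neighbours[src].add(tgt)
        neighbours[tgt].add(src)
    seen, stack = {s}, [s]
    while stack:
        for w in neighbours[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


# The empty and the elementary flow graph both satisfy the definition;
# adding an edge back into s violates it.
empty = is_flow_graph({"s", "t"}, {("s", "l", "t")}, "s", "t")
elementary = is_flow_graph(
    {"s", "v", "t"}, {("s", "l", "v"), ("v", "l", "t")}, "s", "t"
)
broken = is_flow_graph(
    {"s", "v", "t"}, {("s", "l", "v"), ("v", "l", "t"), ("v", "l", "s")}, "s", "t"
)
```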

Figure 1 shows a simple example of the graphical representation of a flow graph. It will be used throughout this paper because it contains most of the features needed to explain the transformation algorithms. In this example

N = {vs, va, va1, va2, vb, vc, vd, vd1, vd2, vt},

E = {(vs, next, va), (va, true, va1), (va, false, va2), ..., (vd2, next, vt)},

Λ = {s, a, a1, a2, b, c, d, d1, d2, t, next, true, false},

vs with label s is the start node and vt with label t is the terminal node.

Further, to ease identification, we will identify nodes by their labels when it is possible (when a bijection between nodes and node labels exists).

According to the above definitions the flow graph shown in Figure 1 is connected and has 7 strongly connected components: {s}, {a}, {a1}, {a2}, {b, c, d, d1}, {d2}, {t}.
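The component list above can be reproduced with a standard SCC algorithm. The sketch below uses Tarjan's algorithm; the edge list `FIGURE_1_EDGES` is an assumption, inferred from the execution sequences and the component list given in the text, since Figure 1 itself is not reproduced here.

```python
# Assumed edge list for the running example (inferred from the text).
FIGURE_1_EDGES = [
    ("s", "next", "a"), ("a", "true", "a1"), ("a", "false", "a2"),
    ("a1", "next", "b"), ("a2", "next", "c"), ("c", "next", "b"),
    ("b", "next", "d"), ("d", "true", "d1"), ("d", "false", "d2"),
    ("d1", "next", "c"), ("d2", "next", "t"),
]


def strongly_connected_components(edges):
    """Tarjan's algorithm; returns a set of frozensets of node names."""
    nodes = {n for src, _, tgt in edges for n in (src, tgt)}
    succ = {n: [] for n in nodes}
    for src, _, tgt in edges:
        succ[src].append(tgt)
    index, low, on_stack, stack, result = {}, {}, set(), [], set()

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in succ[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            result.add(frozenset(component))

    for n in sorted(nodes):
        if n not in index:
            visit(n)
    return result


sccs = strongly_connected_components(FIGURE_1_EDGES)
```

Under the assumed edge list, `sccs` contains seven components, the only non-trivial one being {b, c, d, d1}.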

There are two common types of nodes in a flow graph:

• The functional type (function), which represents some operation (semantically described by the label λ(n)) to be carried out on an object v ∈ N (nodes a1, a2, b, c, d1, d2 in our example).

• The predicative type (predicate), which does not operate on an object but decides on the next operation to be carried out, according to whether or not a certain property of v ∈ N holds (nodes a and d in our example).

There are two usual representations of node types in flow graphs: functions are represented by rectangular boxes and predicates are represented by diamond-shaped boxes. But because of our graph tool features we can use only rectangular boxes for our representations.

Therefore, in this paper we distinguish functional and predicative node types by the number of their leaving edges as follows:

• a functional box has exactly one leaving edge (with the label next in our example in Figure 1) and

• a predicative box has exactly two leaving edges (with the labels true and false in our example in Figure 1).
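This out-degree convention is easy to apply programmatically. A minimal sketch follows, assuming the same inferred edge list for Figure 1 (the figure is not reproduced here) and a hypothetical helper `classify_nodes`; the start node is skipped because it is neither a function nor a predicate.

```python
from collections import Counter

# Assumed edge list for the running example (inferred from the text).
FIGURE_1_EDGES = [
    ("s", "next", "a"), ("a", "true", "a1"), ("a", "false", "a2"),
    ("a1", "next", "b"), ("a2", "next", "c"), ("c", "next", "b"),
    ("b", "next", "d"), ("d", "true", "d1"), ("d", "false", "d2"),
    ("d1", "next", "c"), ("d2", "next", "t"),
]


def classify_nodes(edges, start):
    """Map each inner node to 'function' or 'predicate' by out-degree."""
    out_degree = Counter(src for src, _, _ in edges)
    kinds = {}
    for node, degree in out_degree.items():
        if node == start:
            continue  # the start node is not a function or predicate
        if degree == 1:
            kinds[node] = "function"
        elif degree == 2:
            kinds[node] = "predicate"
        else:
            raise ValueError(f"node {node!r} has {degree} leaving edges")
    return kinds


kinds = classify_nodes(FIGURE_1_EDGES, start="s")
predicates = {n for n, k in kinds.items() if k == "predicate"}
functions = {n for n, k in kinds.items() if k == "function"}
```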

The different node types that are supported by flow graphs, together with their relationships, are shown in Figure 2, where we appeal to the reader’s intuition about the meanings of this graph.

Figure 2: The types in the flow diagram

Let Φ = (G, s, t) be a flow graph and p be a path in G from the start node s to the terminal node t. Then we say that p is a full path in the flow graph Φ.

Definition 6 (execution sequence) Let p be a full path in a flow graph Φ = (G, s, t). Then an execution sequence λ(p) is the word representation of p:

λ(p) = λ(s)λ(e0)λ(v1)λ(e1)λ(v2) ... λ(vk)λ(ek)λ(t).

For instance, the sequences (s next a true a1 next b next d false d2 next t) and (s next a false a2 next c next b next d true d1 next c next b next d false d2 next t) are two execution sequences for our example in Figure 1.
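An execution sequence is determined by the outcomes chosen at the predicates along a full path. The sketch below walks the graph and emits the word representation; the edge list is again an assumption inferred from the text, and `execution_sequence` is a hypothetical helper name.

```python
# Assumed edge list for the running example (inferred from the text).
FIGURE_1_EDGES = [
    ("s", "next", "a"), ("a", "true", "a1"), ("a", "false", "a2"),
    ("a1", "next", "b"), ("a2", "next", "c"), ("c", "next", "b"),
    ("b", "next", "d"), ("d", "true", "d1"), ("d", "false", "d2"),
    ("d1", "next", "c"), ("d2", "next", "t"),
]


def execution_sequence(edges, s, t, decisions):
    """Follow a full path from s to t.

    `decisions` lists the outcome ("true" or "false") taken at each
    predicate node, in the order the predicates are visited.
    """
    succ = {}
    for src, label, tgt in edges:
        succ.setdefault(src, {})[label] = tgt
    outcomes = iter(decisions)
    word, node = [s], s
    while node != t:
        out = succ[node]
        # Functions have a single 'next' edge; predicates need a decision.
        label = "next" if "next" in out else next(outcomes)
        node = out[label]
        word += [label, node]
    return " ".join(word)


first = execution_sequence(FIGURE_1_EDGES, "s", "t", ["true", "false"])
second = execution_sequence(FIGURE_1_EDGES, "s", "t", ["false", "true", "false"])
```

With these decisions, `first` and `second` reproduce the two execution sequences quoted above.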

Let Path be the (possibly infinite) set of all possible full paths in the flow graph Φ = (G, s, t). The word representation of Path is thus regarded as a formal language (or just language) Lang = λ(Path) defined over the alphabet Λ by the flow graph Φ.

Definition 7 (equivalence of flow graphs) Two flow graphs Φ1 and Φ2 are equivalent (denoted Φ1 ∼ Φ2) if they define the same language.

For instance, the flow graphs Ξ and Ξ′ shown in Figure 5 (a) and Figure 5 (b), respectively, are equivalent. Indeed, they define the same language Lang(Ξ) = Lang(Ξ′) = {a next α (false a next α)* true}, where “*” denotes matching of the preceding element zero or more times.

Algebra of flow diagrams. For our subsequent definitions we also need the formal concept of an algebra, which we likewise repeat here, for the sake of completeness and to introduce the notations [EEP06].

Definition 8 (signatures and algebras) An algebraic signature, or signature for short, is a tuple Sig = (Sort, Oper, par) where

• Sort is a set of sorts;

• Oper is a set of operation symbols;

• par: Oper → Sort+ is a mapping that associates to every operation op ∈ Oper a non-empty string of sorts par(op) = ϕ0 … ϕn for n ≥ 0, of which the elements ϕ0 … ϕn–1 are the parameter sorts and ϕn is the result sort.

(a) Π(a, b) (b) Δ(α, a, b) (c) Ω(α, a)

(d) Seq construction (e) IfThenElse construction (f) While construction

Figure 3: Diagrams of Π(a, b), Δ(α, a, b), Ω(α, a) and respective syntax tree constructions

A Sig-algebra is a tuple A = (Data, Func) where

• Data is a set of disjoint carrier sets, one per sort ϕ ∈ Sort, denoted Dϕ;

• Func is a set of functions, one per operation op ∈ Oper, denoted fop;

• The type of the functions in Func should be consistent with par, in the sense that fop: D_ϕ0 × ... × D_ϕn–1 → D_ϕn whenever par(op) = ϕ0 … ϕn.

Note that an operation op with no parameters represents a constant value, which in a Sig-algebra is given by fop().

A flow diagram is a graphical representation of the flow graph which is suitable for representing programs, Turing machines, etc. Diagrams are usually composed of boxes connected by directed lines.

Following [BJ66], we can distinguish three elementary types of flow diagrams Π, Δ and Ω which denote, respectively, the diagrams of Figure 3 (a)-(c) and the constructions ‘sequence’, ‘if-then-else’ and ‘while’ in programming languages. Let us call these three elementary types of flow diagrams Γ = {Π, Δ, Ω} base subdiagrams.

Let us assume a universe Θ of arbitrary flow graphs, a set θfunc ⊂ Λ of all functional node labels and a set θpred ⊂ Λ of all predicative node labels.

Definition 9 (flow graph substitution) Let Φ = (G, s, t) be an arbitrary flow graph where G = (N, E, λ) and N' = N \ {s, t}. A flow graph substitution is a mapping Sub: N' → Θ that maps each node v ∈ N' to a flow graph Φv = (Gv, sv, tv) where Gv = (Nv, Ev, λv), and obeys the following rules:

• Φ[Φv / v] = (GSub, sSub, tSub) is a flow graph, GSub = (NSub, ESub, λSub);

• sSub = s and tSub = t;

• NSub = (N \ {v}) ∪ (Nv \ {sv, tv});

• ESub = (E \ EDel) ∪ (Ev \ EvDel) ∪ (EsIns ∪ EtIns) where

o EDel = {e ∈ E: src(e) = v or tgt(e) = v},

o EvDel = {ev ∈ Ev: src(ev) = sv or tgt(ev) = tv},

o EsIns = {eSub ∈ ESub | ∃ ev ∈ Ev: src(ev) = sv, tgt(ev) = tgt(eSub), λ(ev) = λ(eSub);

o EtIns = {eSub ∈ ESub | ∃ ev ∈ Ev: src(ev) = src(eSub), tgt(ev) = tv, λ(ev) = λ(eSub); ∃ e ∈ E: src(e) = v, tgt(e) = tgt(eSub)}.

A substitution Sub can be extended to the whole flow graph as Φ[Sub] = Φ [Φ_v1/ v1] [Φ_v2/ v2] … [Φ_vn/ vn].

Let us define the signature Sig = (Sort, Oper, par) for the flow graphs. We have a sort fg, representing the arbitrary flow graphs, a sort pred, representing the predicative nodes, and a sort func, representing the functional nodes. We also define a constant empty for the empty flow graph and operation symbols for the elementary flow graphs (for each functional node) and our base subdiagrams Γ = {Π, Δ, Ω}:

Sig =

Sort: fg, pred, func;

Oper: empty, elem, Π, Δ, Ω;

par: empty: → fg,

elem: func → fg,

Π: fg fg → fg,

Δ: pred fg fg → fg,

Ω: pred fg → fg.

Then the implementation of the signature Sig for flow graphs is the following algebra FlowGraph:

Dfg = Θ, Dfunc = θfunc, Dpred = θpred,

fempty = ε ∈ Θ,

felem : Dfunc → Dfg, a ↦ {(N, E, λ) | N = {s, v, t}, E = {(s, l, v), (v, l, t)}, λ(v) = a},

fΠ : Dfg × Dfg → Dfg, (Φa, Φb) ↦ Π[Φa / va][Φb / vb],

fΔ : Dpred × Dfg × Dfg → Dfg, (α, Φa, Φb) ↦ Δ[Φa / va][Φb / vb],

fΩ : Dpred × Dfg → Dfg, (α, Φa) ↦ Ω[Φa / va].

Definition 10 (strong decomposition or well-formedness) A flow diagram Φ = (G, s, t) where G = (N, E, λ) is strongly decomposable (or well-formed in terms of [PKT73] and [EKR08]) if there exists an expression exp in the Sig-algebra FlowGraph such that FlowGraph[[exp]] ≅ Φ.

An example of strong decomposition is shown in Figure 4 (a) where the base subdiagrams are isolated with dashed lines; it can be expressed as follows:

Φ ≅ fΠ( fΩ(a, fΠ( fΔ(b, felem(b1), felem(b2)), felem(c))), felem(d)).

A counterexample, that is, a flow diagram which cannot be strongly decomposed, is our running example in Figure 1.

Together with a strong decomposition, [BJ66] considered another decomposition which is obtained by operating on an equivalent strongly decomposable flow graph.

(a) Strongly decomposable graph (b) Syntax tree

Figure 4: Strongly decomposable graph Φ and respective syntax tree

Formally, a flow graph Φ is weakly decomposable if Φ ∼ Φ′ for some strongly decomposable flow graph Φ′.

As an example of weak decomposition, we introduce Ξ(α, a), which denotes the ‘repeat’ construction in programming languages and corresponds to the diagram of Figure 5 (a); it cannot be strongly decomposed into base subdiagrams. However, we can define the equivalent strongly decomposable flow diagram Ξ′(α, a), shown in the diagram of Figure 5 (b), as follows:

Ξ(α, a) ∼ Ξ′(α, a) ≅ fΠ( felem(a), fΩ(α, felem(a))).

(a) Ξ(α, a) (b) Ξ′(α, a) ≅ fΠ( felem(a), fΩ(α, felem(a)))

Figure 5: Diagram of Ξ(α, a) and diagram equivalent to Ξ(α, a)

Algebra of syntax trees. Another data structure used by compilers, converters and transformation tools to represent programming language constructs is a tree structure known as an abstract syntax tree. In general, an abstract syntax tree is a data structure consisting of types that represent language constructs connected by sequence and unit valued relationships to other types [OMG05].

In terms of graph theory, an abstract syntax tree is a tree, that is to say, an acyclic graph with a single root node, connecting nodes and leaf nodes. Then, similarly to the graph definition above, we can define a syntax tree as follows.

Definition 11 (syntax tree) An abstract syntax tree, or just syntax tree, is a tuple T = (GT, root) where

• GT = (NT, ET, λT) is an acyclic connected labeled directed graph;

• root ∈ NT is a single root node;

• NT = Nn ∪ Nl such that Nn ∩ Nl = ∅, where Nn is a set of internal nodes and Nl is a set of leaf nodes.

Each node of the syntax tree in our case should denote a construction occurring in the flow diagram. For instance, the base subdiagrams in Figure 3 (a)-(c) may be denoted by constructions Seq, IfThenElse and While in Figure 3 (d)-(f), respectively. The different node types that are supported by syntax trees, together with their relationships, are shown in Figure 6.

Figure 6: The types in the syntax trees for our implementation

Let us assume a universe Ψ of arbitrary syntax trees and a set ϑ of syntax tree constructions {Seq, IfThenElse, While}.

Definition 12 (syntax tree substitution) Let T = (GT, root) be an arbitrary syntax tree where GT = (NT, ET, λT) and NT = Nn ∪ Nl. A syntax tree substitution is a mapping Sub: NT → Ψ that maps each leaf node v ∈ Nl, representing functional nodes in the flow graph, to a syntax tree Tv = (G_Tv, root_Tv) where G_Tv = (N_Tv, E_Tv, λ_Tv), and obeys the following rules:

• T [Tv / v] = (GSub, rootSub) is a syntax tree, GSub = (NSub, ESub, λSub);

• rootSub = root;

• NSub = (NT \ {v}) ∪ N_Tv;

• ESub = (ET \ ETDel) ∪ E_Tv ∪ EIns where

o ETDel = {e ∈ ET: tgt(e) = v},

o EIns = {eSub ∈ ESub | ∃ e ∈ ETDel: src(e) = src(eSub), λ(e) = λ(eSub); tgt(eSub) = root_Tv}.

A substitution Sub can be extended to the whole syntax tree as T [Sub] = T [T_v1/ v1] [T_v2/ v2] … [T_vn/ vn].

Note that a syntax tree is empty if N = ∅ and E = ∅. A syntax tree is an elementary a-labelled syntax tree if N = {v}, root = v, λ(v) = a and E = ∅.

Since a signature Sig gives us only the syntax, we can implement it with a different algebra SyntaxTree, which describes operations on syntax trees as follows:

D_Tfg = Ψ, D_Tfunc = θfunc, D_Tpred = θpred,

gempty = ε ∈ Ψ,

gelem : D_Tfunc → D_Tfg, a ↦ {(N, E, λ) | N = {v}, λ(v) = a, E = ∅},

gΠ : D_Tfg × D_Tfg → D_Tfg, (Ta, Tb) ↦ Seq[Ta / va][Tb / vb],

gΔ : D_Tpred × D_Tfg × D_Tfg → D_Tfg, (α, Ta, Tb) ↦ IfThenElse[Ta / va][Tb / vb],

gΩ : D_Tpred × D_Tfg → D_Tfg, (α, Ta) ↦ While[Ta / va].
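To make the SyntaxTree interpretation concrete, the sketch below models the operations as Python dataclasses and evaluates the expression given earlier for the strongly decomposable example, fΠ(fΩ(a, fΠ(fΔ(b, felem(b1), felem(b2)), felem(c))), felem(d)). The class names mirror the constructions Seq, IfThenElse and While; `Elem` and `render` are hypothetical names introduced only for this illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Elem:
    """Elementary syntax tree for a functional node (g_elem)."""
    label: str


@dataclass(frozen=True)
class Seq:
    """Sequence construction (g_Π)."""
    first: Tree
    second: Tree


@dataclass(frozen=True)
class IfThenElse:
    """If-then-else construction (g_Δ)."""
    pred: str
    then: Tree
    orelse: Tree


@dataclass(frozen=True)
class While:
    """While construction (g_Ω)."""
    pred: str
    body: Tree


Tree = Union[Elem, Seq, IfThenElse, While]


def render(tree: Tree) -> str:
    """Print a syntax tree as a nested term."""
    if isinstance(tree, Elem):
        return tree.label
    if isinstance(tree, Seq):
        return f"Seq({render(tree.first)}, {render(tree.second)})"
    if isinstance(tree, IfThenElse):
        return f"IfThenElse({tree.pred}, {render(tree.then)}, {render(tree.orelse)})"
    return f"While({tree.pred}, {render(tree.body)})"


example = Seq(
    While("a", Seq(IfThenElse("b", Elem("b1"), Elem("b2")), Elem("c"))),
    Elem("d"),
)
rendered = render(example)
```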

Let us consider now a representation of a flow graph Φ as a syntax tree T, called a syntax tree decomposition.

Definition 13 (syntax tree decomposition) A syntax tree decomposition of a weakly decomposable flow graph Φ = (G, s, t) is the following morphism:

STD: Φ ↦ {SyntaxTree[[exp]] | FlowGraph[[exp]] ≅ Φ′ ∼ Φ}

where Φ′ = (G′, s, t) is a strongly decomposable flow graph equivalent to Φ.

An example of a syntax tree decomposing the flow graph Φ from Figure 4 (a) is shown in Figure 4 (b).

Chapter 3.

Flow Diagram Decomposition

There exist today many variants on the idea of a structured program. One of the first approaches to restructuring was given by Böhm and Jacopini [BJ66]. Their restructuring method was done in the context of flow diagram decomposition. Some researchers mentioned (see, for instance, [Amm92], [EH94]) that this result is mostly of historical and theoretical interest and does not give a complete algorithm. On the contrary, we believe that their method presents a sufficient set of pattern matching rules and transformations for implementation in terms of graph transformations. We discuss details of this approach in Section 3.1 and present some results of the implementation in Groove in Chapter 4.

There have been several other approaches to restructuring program flow diagrams. Peterson et al. present a proof that every flow diagram can be transformed into an equivalent strongly decomposable (well-formed) flow diagram [PKT73]. They present a graph algorithm to do such a transformation using a technique of node-splitting and they proved that this transformation is correct. We found this algorithm very useful to improve the quality of flowcharts, especially a process of eliminating multiple entries in the strongly connected components. Application of this algorithm as a part of Böhm-Jacopini approach is discussed in Section 3.2. The complexity measure to evaluate the advantage of the Peterson method is considered in Section 3.3 and some concrete implementation results are presented in Chapter 4.

A concise review of many of the other results developed in this field has been prepared in [EH94]. We also come back to that discussion in the closing remarks about future work in Chapter 5.

3.1. Base subdiagram decomposition

The set of definitions introduced in the previous section is within the scope of the existing graph theory. In this section, we introduce a way to enrich the usual definitions, and so formalize the concepts of flow graph decomposition.

The preliminaries of the Böhm-Jacopini method [BJ66] were presented in Chapter 2. In addition to the three base subdiagrams Π, Ω and Δ, they introduced three new functions denoted by T, F, K, and a new predicate ω, which define the behavior of a set of auxiliary boolean variables.

The effect of the first two functions T and F is to create a new boolean variable with value true or false, respectively, and the function K deletes the last boolean variable. The predicate ω is verified or not according to whether the last boolean variable value is true or false; the value of the predicate ω is true iff the last boolean variable value is true.

Recall that if Path is the set of all possible full paths in the flow graph Φ, then the word representation of Path can be regarded as a language Lang(Φ) defined (in the extended case) over the alphabet Λ ∪ {T, F, K, ω}. Let the node types and their relationships be as shown in Figure 2.

Then we can define a ‘satisfiability’ function Sat: Lang(Φ) → Lang*(Φ), where Lang*(Φ) = Lang(Φ) ∪ {ε}, as follows: for all words w = (x1 x2 … xi … xj … xn) ∈ Lang(Φ) where xk ∈ Λ ∪ {T, F, K, ω}, k ∈ [1, n],

$$
\mathit{Sat}(w) =
\begin{cases}
\varepsilon & \text{if } \exists\, i, j \in [2, n-2],\ i < j:\ x_i \in \{T, F\};\ x_j = \omega;\ x_{j+1} \in \{\mathit{true}, \mathit{false}\} \setminus \{\tilde{x}_i\} \\
 & \quad \text{and } \forall k \in [i+1, j-1]:\ x_k \notin \{T, F, K, \omega\}; \\
w & \text{otherwise,}
\end{cases}
$$

where

$$
\tilde{x} =
\begin{cases}
\mathit{true} & \text{if } x = T; \\
\mathit{false} & \text{if } x = F; \\
x & \text{otherwise.}
\end{cases}
$$

Therefore the language Sat(Lang(Φ)) denotes a set of all full path word representations in the flow graph Φ that satisfy our definitions of new functions T, F, K and predicate ω.

Let us define a function

Restrict: Lang*(Φ) → Lang*(Φ) \ {T, F, K, ω}

as follows: for all words w = (x1 x2 … xi–1 xi xi+1 xi+2 … xn) ∈ Lang*(Φ) where xj ∈ Λ, j = 1, 2, …, i–1, i+1, …, n and xi ∈ {T, F, K, ω}

Restrict(w) = (x1 x2 … xi–1 xi+2 … xn).
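The two functions can be sketched on words represented as tuples of symbols. This is an illustration under stated assumptions rather than the authors' definition: `sat` follows one reading of Sat (a word is mapped to ε, here the empty tuple, when an ω outcome contradicts the most recent T or F with no other auxiliary symbol in between), and `restrict` applies the deletion of an auxiliary symbol together with the edge label that follows it to every occurrence in the word.

```python
AUXILIARY = {"T", "F", "K", "ω"}
TILDE = {"T": "true", "F": "false"}  # value created by T and F
EPSILON = ()  # representation of the empty word ε


def sat(word):
    """Return `word` if its ω outcomes are consistent, else ε."""
    for i, x in enumerate(word):
        if x not in TILDE:
            continue
        # Look for the first auxiliary symbol after position i.
        for j in range(i + 1, len(word) - 1):
            if word[j] in AUXILIARY:
                wrong = {"true", "false"} - {TILDE[x]}
                if word[j] == "ω" and word[j + 1] in wrong:
                    return EPSILON
                break
    return tuple(word)


def restrict(word):
    """Drop every auxiliary symbol and the edge label following it."""
    result, skip_next = [], False
    for x in word:
        if skip_next:
            skip_next = False
        elif x in AUXILIARY:
            skip_next = True
        else:
            result.append(x)
    return tuple(result)


# A word in which T is followed by a consistent ω outcome.
w1 = tuple(
    "s next a true a1 next T next K next b next d false "
    "T next ω true K next d2 next t".split()
)
# A word in which F is followed by the contradictory outcome 'true'.
w2 = tuple(
    "s next a true a1 next T next K next b next d true d1 next c next "
    "F next ω true K next d2 next t".split()
)
reduced = " ".join(restrict(sat(w1)))
```

Here `reduced` is the execution sequence of the original flow graph, while `sat(w2)` is the empty word because the outcome of ω contradicts the preceding F.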

Then the language $\overline{\mathit{Lang}}(\Phi)$ = Restrict(Sat(Lang(Φ))) is a restricted language of the flow graph Φ over the alphabet Λ. We can now extend the definition of flow graph equivalence.

Definition 14 (equivalence of extended flow graphs) Two flow graphs Φ1 and Φ2 extended by functions T, F, K and predicate ω are equivalent if Restrict(Lang(Φ1)) = Restrict(Lang(Φ2)).

In light of the discussion above, the definition of weak decomposition can be extended to a decomposition which is obtained by operating on an equivalent strongly decomposable extended flow graph.

For instance, we can define three words w1, w2, w3 over the alphabet {s, a, a1, a2, b, c, d, d1, d2, t, next, true, false, T, F, K, ω} for the flow graph Φ in Figure 7 (a): w1 = (s next a true a1 next T next K next b next d false T next ω true K next d2 next t), w2 = (s next a true a1 next T next K next b next d true d1 next c next F next ω true K next d2 next t) and w3 = (s next b next d false T next ω true d2 next t). Thereby we have two flow graph execution sequences w1, w2 ∈ Lang(Path), two words accepted by our CF-grammar Gram, w1, w3 ∈ Lang(Gram), and only one word that belongs to their intersection, w1 ∈ Lang(Φ) = Lang(Path) ∩ Lang(Gram). Applying the reduction function gives Reduct(w1) = (s next a true a1 next b next d false d2 next t), which is the same as the first execution sequence in the example in Chapter 2.

In general, the flow graph in Figure 1 and the two extended flow graphs in Figure 7 (a)-(b) are equivalent.

Theorem 1 (weak decomposition of flow graphs) For any flow graph Φ1 there is (at least) one equivalent strongly decomposable flow graph Φ2 extended by the functions K, T, F and predicate ω.

(a) Böhm-Jacopini decomposition (b) Decomposition using SCC method

Figure 7: Two strongly decomposable (well-formed) extended flow graphs equivalent to the flow graph in Figure 1

Proof Sketch. The proof suggested in [BJ66] (and the decomposition algorithm proper) is based on the flow diagram classification represented in Figure 8 (a)-(c).

The main idea of this classification is to determine the type of the first element of the flow diagram (functional or predicative) and the remaining part of the diagram (inside the dashed lines in Figure 8 (a)-(c)), which is called A, B and C. The edges marked 1 and 2 denote an aggregated set of edges from nodes inside the A, B and C structures to the first element or to the last element of the flow diagram, respectively. The edges 1 and 2 may not always both be present; nevertheless, from each of the A, B and C structures at least one edge 1 or 2 must start.

For instance, the flow graph Φ = Π(Ω(a, Π(Δ(b, b1, b2), c)), d) in Figure 4 (a) can be considered as a type II graph where a is the first predicative node, A is the subgraph of Φ with nodes {b, b1, b2, c} and the edges between them, and B = {d}; the structure A has an edge (c, next, a) marked 2 and no edges marked 1, and the structure B has an edge (d, next, t) marked 1 and no edges marked 2.

(a) Structure of a type I diagram

(b) Structure of a type II diagram (c) Structure of a type III diagram

(a) Transformation of a type I diagram

(b) Transformation of a type II diagram

(c) Transformation of a type III diagram

Figure 9: Transformation of three types of flow diagrams to the equivalent extended strongly decomposable flow diagrams

The equivalent strongly decomposed flow diagrams of type I and type II diagrams are shown in Figure 9 (a)-(b). The case of a type III diagram may be dealt with as case II by substituting Figure 9 (c), where C' indicates the subpart of C accessible from the upper entrance, and C" the part accessible from the lower entrance.

Let Φ1 be a flow graph of type I as shown in Figure 8 (a). It is obvious that Reduct(Lang(Φ1)) = Lang(Φ1) = Lang(Path1) = {(a next A (next|true|false)_2)* a next A (next|true|false)_1}, where “*” denotes matching of the preceding element zero or more times and (next|true|false)_k, k = 1, 2, denotes edges marked 1 or 2, respectively.

Let Φ2 be a strongly decomposable flow graph extended by the functions K, T, F and predicate ω as shown in Figure 9 (a). Then Lang(Path2) = {T next K next a next A ((next|true|false)_1 T next | (next|true|false)_2 F next))* true K next} and Lang(Gram2) is a CF-language generated by the CF-grammar Gram over the alphabet {a, A, next, true, false, T, F, K, ω}. Therefore Lang(Φ2) = Lang(Path2) ∩ Lang(Gram2) = {T next (K next a next A (next|true|false)_2 F next ω false)* K next a next A (next|true|false)_1 T next ω true K next} and Reduct(Lang(Φ2)) = {(a next A (next|true|false)_2)* a next A (next|true|false)_1} = Reduct(Lang(Φ1)), so Φ1 and Φ2 are equivalent. Similarly, the same operations can be applied to a type II flow graph.

If it is assumed that A and B are, by inductive hypothesis, strongly decomposable, then the theorem is proved.

Applying the Böhm-Jacopini decomposition to the example in Figure 1, the extended flow diagram shown in Figure 7 (a) is generated.

3.2. SCC Decomposition

Peterson et al. present an algorithm that improves the characteristics of the Böhm-Jacopini method when the flow graph contains strongly connected components with multiple entry points [PKT73].

Theorem 2 (well-formed flow diagram) Every flow diagram can be transformed into an equivalent strongly decomposable (well-formed) flow diagram by node duplication (proof see [PKT73]).

In the proof of this theorem the authors presented the following algorithm that examines strongly connected components for multiple entry points and removes extra entry points by node duplication.

Suppose we have a flow diagram that contains a strongly connected component U with multiple entry nodes. Choose one, X, to become the unique entry node; the others, Y1, Y2, …, are to be removed by node duplication. Next, introduce new nodes Y1', Y2', ..., remove any entry branches from Y1, Y2, … and connect them to Y1', Y2', ..., respectively.

Now for each primed node Z', including the ones introduced in this step, if the original node Z connects to a node W outside U, place a branch representing the same processing from Z' to W. If Z connects to X, then connect Z' to X with a branch representing the same processing. If Z connects to any other node W of U, then make a new node W' if this hasn't already been done, and connect Z' to W' with a branch representing the same processing as the branch from Z to W.

It can be seen that the new flow diagram and the old one are equivalent, and the old flow diagram can be reconstructed if the corresponding primed and unprimed nodes are merged in the new one.
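The duplication step described above can be sketched as a worklist algorithm. This is an illustrative reading of the description, not the implementation of [PKT73]; the function name `remove_extra_entries`, the priming convention (appending an apostrophe to a node name) and the edge list for Figure 1 (inferred from the text) are all assumptions.

```python
# Assumed edge list for the running example (inferred from the text).
FIGURE_1_EDGES = [
    ("s", "next", "a"), ("a", "true", "a1"), ("a", "false", "a2"),
    ("a1", "next", "b"), ("a2", "next", "c"), ("c", "next", "b"),
    ("b", "next", "d"), ("d", "true", "d1"), ("d", "false", "d2"),
    ("d1", "next", "c"), ("d2", "next", "t"),
]


def remove_extra_entries(edges, component, entry):
    """Make `entry` the only entry node of `component` by duplication.

    Returns the new edge set and the set of introduced (primed) nodes.
    """
    result = set(edges)
    primed = {}    # original node -> its primed copy
    worklist = []  # originals whose leaving edges still need copying

    def copy_of(node):
        if node not in primed:
            primed[node] = node + "'"
            worklist.append(node)
        return primed[node]

    # Redirect entry branches of the other entry nodes to primed copies.
    for src, label, tgt in edges:
        if src not in component and tgt in component and tgt != entry:
            result.discard((src, label, tgt))
            result.add((src, label, copy_of(tgt)))

    # Copy the leaving branches of every node that received a copy.
    while worklist:
        z = worklist.pop()
        for src, label, w in edges:
            if src != z:
                continue
            if w not in component or w == entry:
                result.add((primed[z], label, w))
            else:
                result.add((primed[z], label, copy_of(w)))
    return result, set(primed.values())


scc = {"b", "c", "d", "d1"}
edges_b, copies_b = remove_extra_entries(FIGURE_1_EDGES, scc, entry="b")
edges_c, copies_c = remove_extra_entries(FIGURE_1_EDGES, scc, entry="c")
```

In this sketch, choosing b as the entry node introduces a single copy of c, whereas choosing c introduces more copies, which illustrates why the choice of entry node matters.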

Let us come back to our main example in Figure 1, where nodes b, c, d and d1 form a strongly connected component, and b and c are its entry nodes. If b is chosen as the entry node and c is duplicated, the well-structured flow diagram with the extended flow graph shown in Figure 7 (b) results. This turns out to be the better choice because this flow graph is intuitively ‘better’ than the flow graph in Figure 7 (a).

But if c is chosen as the entry node and b is duplicated, some more duplicating steps are necessary, and after four steps we can obtain the same flow graph as with the Böhm-Jacopini method shown in Figure 7 (a), as well as three different flow graphs not shown here.

The fact that there are many variants of equivalent flow graphs, and that some of them are ‘better’ than others, brings us to the issue of complexity measuring presented in the next section.

3.3. Complexity measuring

Maintenance typically requires more resources than new software development. For years researchers have tried to understand how programmers comprehend programs. The literature provides two approaches to comprehension: cognitive models that emphasize cognition by what the program does (a functional approach) and a control-flow approach which emphasizes how the program works. The current state of the art in this direction is surveyed in the reviews [CV06], [WY96].

A well-known and often used measure was proposed by McCabe in [McC76].

Definition 15 (cyclomatic number) The cyclomatic number v(Φ) of a flow graph Φ with n nodes, e edges, and p connected components is

v(Φ) = e – n + 2p.

McCabe suggested measuring the complexity of a program by the number of linearly independent paths v(Φ), controlling the "size" of programs by setting an upper limit on v(Φ) (instead of using just physical size), and using the cyclomatic complexity as the basis for a testing methodology.

In [Mil72] the following was proved: if the number of functions and predicates in a structured program is θ and π, respectively, and e is the number of edges, then e = 1 + θ + 3π.

Since n = θ + 2π + 2 and assuming p = 1, we get that the cyclomatic complexity of a structured program equals the number of predicates plus one:

v(Φ) = π + 1.
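The step from the node and edge counts to v(Φ) = π + 1 can be checked numerically. The following sketch assumes McCabe's formula v = e – n + 2p from [McC76] and the counts e = 1 + θ + 3π and n = θ + 2π + 2 quoted above; the function names are illustrative.

```python
def cyclomatic_number(nodes, edges, components=1):
    """McCabe's cyclomatic number v = e - n + 2p."""
    return edges - nodes + 2 * components

def structured_program_size(functions, predicates):
    """Node and edge counts of a structured program with theta functions
    and pi predicates: n = theta + 2*pi + 2 and e = 1 + theta + 3*pi."""
    nodes = functions + 2 * predicates + 2
    edges = 1 + functions + 3 * predicates
    return nodes, edges

# For every small combination of theta and pi, v equals pi + 1,
# independently of the number of functions theta.
for theta in range(6):
    for pi in range(6):
        n, e = structured_program_size(theta, pi)
        assert cyclomatic_number(n, e) == pi + 1
```

Symbolically, e – n + 2p = (1 + θ + 3π) – (θ + 2π + 2) + 2 = π + 1 for p = 1, which is what the loop confirms.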

In addition, McCabe proposed a method of measuring the "structuredness" of a program as follows.

Definition 16 (decomposition degree) Let Φ be a flow graph some subgraphs of which are strongly decomposed flow graphs Φ_1, Φ_2, …, Φ_k, where Φ_i = Θ_i(θ^i_1, θ^i_2, …, θ^i_l) and (as in Definition 8) θ^i_j = θ^i_j(x_1, …, x_h, y, θ^i_1, …, θ^i_{l–1}), i = 1, 2, …, k, j = 1, 2, …, l; x_q ∈ N, q = 1, 2, …, h and y ∈ Γ′ = {Π, Δ, Ω, Ξ}. Then the decomposition degree m(Φ) of the flow graph Φ is the number of θ^i_j such that y ∈ Γ′ \ {Π}.

Definition 17 (essential complexity) Let m(Φ) be the decomposition degree of a flow graph Φ. Then the essential complexity ve(Φ), which reflects the lack of structure, is defined as

ve(Φ) = v(Φ) – m(Φ).

Overall, we propose to measure the full complexity of a flow diagram as follows:

Definition 18 (full complexity) Let v(Φ) be the cyclomatic number, ve(Φ) the essential complexity and vd(Φ) the number of duplicated nodes of a flow graph Φ. Then the full complexity V(Φ) is defined as

V(Φ) = [v(Φ) + vd(Φ)] × ve(Φ).

The sum in this formula expresses that the size of a flow diagram is measured by its cyclomatic number together with its number of duplicates. The multiplication by ve(Φ) makes the full complexity grow with the essential complexity; when ve(Φ) = 1, the full complexity reduces to v(Φ) + vd(Φ).

Let us illustrate these complexity measures with our main example shown in Figure 1. The initial flow diagram contains two predicates, therefore v = 3, ve = 3, vd = 0 and V = (3 + 0) × 3 = 9. If we apply the straight Böhm-Jacopini method, the final flow diagram shown in Figure 7 (a) has v = 6, ve = 1, vd = 4 and V = (6 + 5) × 1 = 11. The ‘best choice’ of the SCC method, represented in Figure 7 (b), has v = 4, ve = 1, vd = 1 and V = (4 + 1) × 1 = 5. The other four flow graphs obtained by the SCC method have V = 6, V = 11, V = 12 and V = 12, respectively.

Thus, the full complexity measure V introduced here reflects an intuitive notion of readability and enables us to compare the final syntax trees and minimize their complexity.
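Definitions 17 and 18 translate directly into two small functions. The sketch below uses illustrative function names; the input values are taken from the examples in this paper, and the decomposition degree m = 3 for the best SCC result is not stated explicitly but follows from ve = v – m with v = 4 and ve = 1.

```python
def essential_complexity(v, m):
    """Definition 17: ve = v - m, with m the decomposition degree."""
    return v - m

def full_complexity(v, vd, ve):
    """Definition 18: V = (v + vd) * ve."""
    return (v + vd) * ve

# Initial flow diagram of Figure 1: v = 3, nothing decomposed, no duplicates.
ve_initial = essential_complexity(3, 0)
V_initial = full_complexity(3, 0, ve_initial)

# Best SCC result (Figure 7 (b)): v = 4, vd = 1, ve = 1 (hence m = 3).
ve_scc = essential_complexity(4, 3)
V_scc = full_complexity(4, 1, ve_scc)

# Boehm-Jacopini results of Examples 4a and 4b in Appendix B.
V_4a = full_complexity(12, 6, 1)
V_4b = full_complexity(12, 4, 6)
```

Under these inputs the sketch gives V = 9 for the initial diagram and V = 5 for the best SCC result, matching the values quoted in the text.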


Chapter 4.

Graph Transformations:

Implementation within GROOVE

Graph transformation is a systematic, rule-based transformation technique. It has a solid research foundation [EEP06] and applications in many areas in computer science.

4.1. Graph transformations

A graph production system (GPS) is a set of graph production rules, each of which can transform a source graph into a new graph called the target graph. The rule specifies both the conditions under which it applies and the changes it makes to the source graph. Technically, a graph production rule consists of two partially overlapping graphs, a left hand side L and a right hand side R, and a set of negative application conditions N, which are also (connected) graphs partially overlapping with L. In order to apply the rule, the left hand side L is matched to (a part of) the source graph G, after which the image of L in G is replaced by a copy of R; but a matching is only valid if it cannot be extended to any of the graphs in N – in other words, the structure in the negative application conditions is forbidden in the source graph.

In our visual presentation of a rule used in this paper (which is taken from the Groove tools) we combine all these elements together into one graph, made up of four types of elements:

• Readers: elements present in both L and R. They have to be present in the source graph for L to match and are preserved in the target graph;

• Erasers: elements present in L but not in R. They are matched in the source graph but are not preserved in the target graph, i.e. they are removed.

• Creators: elements absent in L but present in R. They are introduced to the target graph.

• Embargoes: elements absent in L but present in one of the negative application conditions in N.

To distinguish these four types visually, each element has a distinct color and form, as shown in Figure 10: readers are black, erasers are dashed blue (darker gray in black-and-white presentations), creators are bold green (light gray in black-and-white presentations) and embargoes are bold, dashed red (dark gray in black-and-white presentations).

Figure 10: (a) Reader; (b) Eraser; (c) Creator; (d) Embargo
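The interplay of readers, erasers, creators and embargoes can be illustrated with a toy rule applier. This is a minimal sketch and not the Groove implementation or its API: graphs are plain sets of (source, label, target) edges, matching is a brute-force search over injective assignments, node creation and deletion are not modelled, and all names (`find_matches`, `apply_rule`, the rule `SKIP`) are hypothetical.

```python
from itertools import permutations

def find_matches(pattern, edges, nodes, fixed=None):
    """Yield injective assignments of pattern variables to graph nodes such
    that every pattern edge has an image in `edges`. Variables already bound
    in `fixed` keep their image (used to extend a match to an embargo)."""
    fixed = dict(fixed or {})
    variables = sorted({v for s, _, t in pattern for v in (s, t)} - set(fixed))
    candidates = sorted(set(nodes) - set(fixed.values()))
    for image in permutations(candidates, len(variables)):
        match = {**fixed, **dict(zip(variables, image))}
        if all((match[s], lab, match[t]) in edges for s, lab, t in pattern):
            yield match

def apply_rule(rule, edges, nodes):
    """Apply the rule at the first valid match; return the new edge set,
    or None if the rule is not applicable."""
    lhs = rule["readers"] | rule["erasers"]          # left hand side L
    for match in find_matches(lhs, edges, nodes):
        # A match is invalid if it can be extended to any embargo graph.
        forbidden = any(
            next(find_matches(nac, edges, nodes, fixed=match), None) is not None
            for nac in rule["embargoes"]
        )
        if forbidden:
            continue
        erased = {(match[s], lab, match[t]) for s, lab, t in rule["erasers"]}
        created = {(match[s], lab, match[t]) for s, lab, t in rule["creators"]}
        return (edges - erased) | created
    return None

# Hypothetical rule: bypass y in x -next-> y -next-> z unless y is marked "keep".
SKIP = {
    "readers": set(),
    "erasers": {("x", "next", "y"), ("y", "next", "z")},
    "creators": {("x", "next", "z")},
    "embargoes": [{("y", "keep", "y")}],
}
NODES = {"a", "b", "c"}
GRAPH = {("a", "next", "b"), ("b", "next", "c")}

result = apply_rule(SKIP, GRAPH, NODES)
blocked = apply_rule(SKIP, GRAPH | {("b", "keep", "b")}, NODES)
```

The self-loop used as an embargo plays the role of a forbidden node label; in the second call it blocks the only match, so the rule is not applicable.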


4.2. Implementation within GROOVE

We implemented the techniques described in Chapter 3 within the Groove framework (see [Ren04], [EKR08], [KR08]), a standard tool for graph transformations. This allowed us to explore more examples more thoroughly and to make a qualified judgment on practical scalability.

The flow diagram decomposition rules construct a syntax tree by contracting and transforming a flow diagram. In this transformation process, syntax tree elements are introduced into the flow diagram and flow diagram elements are contracted (iteratively) to one node. Our flow diagram decomposition approach consists of the following parts.

Flow diagram and syntax trees. In the first step of our transformation we copy the initial flow diagram Φ to create the same structure for the syntax tree T such that:

• every node v ∈ Φ has a node image Im(v) ∈ T with the same node label;

• every edge (v1, v2) ∈ Φ has an edge image Im(v1, v2) ∈ T in the syntax tree with the edge label flow;

• every node v ∈ Φ is connected with its image Im(v) ∈ T by an edge (v, Im(v)) with the edge label synt;

• each rule creates and maintains the connection between flow diagram elements and syntax tree elements by edges with the label synt.
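The copy step listed above can be sketched as follows. The encoding is an assumption made for illustration: nodes are a dictionary from node identifiers to labels, edges are (source, label, target) triples, and images are named `Im(v)`; the function name `copy_to_syntax_tree` is hypothetical and does not correspond to a Groove rule.

```python
def copy_to_syntax_tree(nodes, edges):
    """Copy a flow diagram into an initial syntax tree structure.

    nodes: dict mapping node id -> node label (assumed encoding).
    edges: set of (source, label, target) triples of the flow diagram.
    Returns the image nodes, the `flow` edges between images and the
    `synt` edges linking every node to its image.
    """
    image = {v: f"Im({v})" for v in nodes}
    # Every node gets an image with the same label.
    tree_nodes = {image[v]: label for v, label in nodes.items()}
    # Every edge gets an image edge labelled `flow`.
    flow_edges = {(image[s], "flow", image[t]) for s, _, t in edges}
    # Every node is linked to its image by a `synt` edge.
    synt_edges = {(v, "synt", image[v]) for v in nodes}
    return tree_nodes, flow_edges, synt_edges

tree_nodes, flow_edges, synt_edges = copy_to_syntax_tree(
    {"n1": "a", "n2": "a1"},
    {("n1", "true", "n2")},
)
```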

Contraction rules. For each type of elementary flow diagram (Π, Ω, Δ and Ξ), we design one flow diagram contraction rule that introduces the necessary syntax tree elements and contracts the elementary flow diagram to one node. Two examples of flow diagram contraction rules, for the IfThenElse (Δ) and While (Ω) statements, are shown in Figure 11 (a)-(b) (compare to Figure 3 (b)-(c) and Figure 3 (e)-(f), respectively).

The notation for contraction rules in Figure 11 is as follows:

• the node types are used in compliance with the notation shown in Figure 10;

• to enumerate all possible labels of an edge we use separate labels or regular expressions such as “?[next, true, false]”;

• to express that two nodes are distinct we use an edge with the label “!=” between them;

• the type diagram for the flow diagram is shown in Figure 2;

• the type diagram for the syntax tree is shown in Figure 6.

Figure 11: (a) Contraction rule for IfThenElse (Δ) statement; (b) Contraction rule for While (Ω) statement
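To give an impression of what a contraction rule does, the following sketch contracts one IfThenElse pattern in a flow graph and records the corresponding syntax tree element. It is not the Groove rule of Figure 11: the assumed shape of the Δ diagram (a predicate p with a true branch t and a false branch f that both continue with a next edge to a common node), the edge encoding and the function name are all assumptions. The check t == f mirrors the “!=” edge that requires two distinct nodes.

```python
def contract_if_then_else(edges, tree):
    """Contract one IfThenElse pattern to a single node.

    edges: set of (source, label, target) triples (assumed encoding).
    tree: dict mapping flow nodes to already built syntax subtrees.
    Returns (new_edges, new_tree), or None if no pattern is found.
    """
    def out_edges(node):
        return sorted((lab, dst) for src, lab, dst in edges if src == node)

    def in_degree(node):
        return sum(1 for _, _, dst in edges if dst == node)

    for p in sorted({src for src, lab, _ in edges if lab == "true"}):
        branches = dict(out_edges(p))
        if len(out_edges(p)) != 2 or set(branches) != {"true", "false"}:
            continue
        t, f = branches["true"], branches["false"]
        # The two branch nodes must be distinct and entered only from p.
        if t == f or in_degree(t) != 1 or in_degree(f) != 1:
            continue
        # Both branches must leave by a single `next` edge to the same node.
        if len(out_edges(t)) != 1 or out_edges(t) != out_edges(f):
            continue
        label, join = out_edges(t)[0]
        if label != "next" or join in (p, t, f):
            continue
        group = f"if({p})"
        inside = {p, t, f}
        # Keep all edges that start outside the pattern; redirect entries of p.
        kept = {(s, lab, group if d == p else d)
                for s, lab, d in edges if s not in inside}
        new_tree = dict(tree)
        new_tree[group] = ("IfThenElse",
                           tree.get(p, p), tree.get(t, t), tree.get(f, f))
        return kept | {(group, "next", join)}, new_tree
    return None

FLOW = {("s", "next", "p"), ("p", "true", "t"), ("p", "false", "f"),
        ("t", "next", "j"), ("f", "next", "j")}
contracted_edges, contracted_tree = contract_if_then_else(FLOW, {})
```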

Decomposition rules. The flow diagram decomposition process operates top-down, starting from the root node of the flow diagram under construction and choosing an appropriate type of flow diagram (I, II or III), as discussed in Section 3.1. Figure 12 (a)-(b) shows the first step of the decomposition process for graph types I and II-III, respectively (compare to Figure 8 (a)-(c)), and the general last step for all graph types is shown in Figure 12 (c) (compare to Figure 9 (a)-(c)).

The notation for decomposition rules in Figure 12 is as follows:

• the node types are used in compliance with the notation shown in Figure 10;

• to enumerate all possible labels of an edge we use separate labels or regular expressions such as “?[next, true, false]”;

• to express that two nodes are distinct we use an edge with the label “!=” between them;

• to define logical behavior we use the universal quantifier ∀ and the existential quantifier ∃; the quantifier ∀>0 restricts the universal quantifier to cases in which at least one matching node exists;

• the type diagram for the flow diagram is shown in Figure 2.

Figure 12: (a) First step for a type I diagram; (b) First step for a type II-III diagram; (c) Last step for all types of diagrams; (d) The final rule to get a syntax tree

The flow diagram decomposition process shown in Figure 12 operates as follows:

• in the first step of the decomposition process a new node Group is added and connected with the first and last flow diagram elements by edges first and last, respectively (Figure 12 (a)); for types II and III it is also connected with the first elements of the conditional branches true and false by edges t_in and f_in, respectively (Figure 12 (b));

• for types II and III an iterative process finds all elements in the conditional branches true and false and connects them with the node Group by edges t_in and f_in, respectively;

• in the last step of the decomposition process, the nodes located in both conditional branches true and false are duplicated and the flow diagram is extended by the special functions T, F, K and the predicate omg (Figure 12 (c)).
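The iterative collection of branch elements can be pictured as a reachability search. The sketch below is an assumption-laden illustration, not the Groove rules: it assumes that the search starts at the first elements of the true and false branches and stops at the last element of the group, and it uses an illustrative encoding of a graph shaped like the main example. Nodes found in both branches are the candidates for duplication.

```python
from collections import deque

def reachable(edges, start, stop):
    """Nodes reachable from `start` without passing through `stop`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        if node in seen or node == stop:
            continue
        seen.add(node)
        queue.extend(dst for src, _, dst in edges if src == node)
    return seen

def branch_membership(edges, predicate, last):
    """Return the t_in set, the f_in set and the nodes lying in both."""
    heads = {lab: dst for src, lab, dst in edges
             if src == predicate and lab in ("true", "false")}
    t_in = reachable(edges, heads["true"], last)
    f_in = reachable(edges, heads["false"], last)
    return t_in, f_in, t_in & f_in

# Illustrative graph (labels are assumptions).
EDGES = [
    ("a", "true", "a1"), ("a", "false", "a2"),
    ("a1", "next", "b"), ("a2", "next", "c"),
    ("b", "next", "d"), ("d", "true", "d1"), ("d", "false", "d2"),
    ("d1", "next", "c"), ("c", "next", "b"),
]
t_in, f_in, in_both = branch_membership(EDGES, "a", "d2")
```

For this graph the nodes b, c, d and d1 lie in both branches, which agrees with the text code representation in Figure 13 (c), where these statements appear in both branches of the conditional.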

SCC rules. To improve the readability of the flow diagrams, we also use strongly connected component (SCC) decomposition rules, as discussed in Section 3.2.

Bottom-up and top-down decomposition. In general, the flow diagram contraction and decomposition process operates in both directions: as long as an elementary flow diagram can be extracted, we apply one of the contraction rules, which gives a bottom-up process; otherwise we apply one of the decomposition rules, which gives a top-down decomposition (the terms top-down and bottom-up decomposition are used in compliance with formal language theory [AU73]).
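This alternation can be obtained with rule priorities, as used in Appendix A, where the contraction rules have priority 10 and the decomposition rules have lower priorities. The following sketch shows such a priority-driven loop in isolation; the state (a pair of counters) and the two rules are toy stand-ins chosen for illustration, not the actual graph rules.

```python
def run_by_priority(state, rules):
    """Repeatedly apply the applicable rule with the highest priority.

    rules: list of (priority, name, function); each function returns the
    new state, or None if the rule is not applicable.
    Returns the final state and the names of the applied rules.
    """
    ordered = sorted(rules, key=lambda rule: -rule[0])
    trace = []
    while True:
        for _, name, apply in ordered:
            result = apply(state)
            if result is not None:
                state = result
                trace.append(name)
                break
        else:
            # No rule is applicable any more.
            return state, trace

# Toy state: (number of elementary diagrams, number of unstructured parts).
def contract(state):
    elementary, unstructured = state
    return (elementary - 1, unstructured) if elementary > 0 else None

def decompose(state):
    elementary, unstructured = state
    # Assumption for the toy model: one decomposition step exposes
    # two elementary diagrams.
    return (elementary + 2, unstructured - 1) if unstructured > 0 else None

final_state, trace = run_by_priority(
    (1, 1), [(8, "decompose", decompose), (10, "contract", contract)]
)
```

In the toy run, contraction is applied whenever possible and decomposition only when no contraction applies, which is the bottom-up and top-down interplay described above.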

Syntax trees. In the last step of our transformation we delete the contracted flow diagram elements and obtain the final syntax tree (see Figure 12 (d)). The type diagram for the syntax tree is shown in Figure 6.

Unfortunately, we cannot explain the precise workings of the Groove implementation in the available space; however, the rules and some example cases are available in Appendix A for the reader to try out.

The final syntax tree obtained by applying the straight Böhm-Jacopini method to the initial flow diagram in Figure 1 is shown in Figure 13 (a) and has v = 6, ve = 1, vd = 4 and V = (6 + 5) × 1 = 11. The best of the five final syntax trees for that initial diagram obtained by the nondeterministic SCC method (see Section 3.2) is shown in Figure 13 (b) and has v = 4, ve = 1, vd = 1 and V = (4 + 1) × 1 = 5. The text code representations corresponding to the final syntax trees in Figure 13 (a)-(b) are presented in Figure 13 (c)-(d), respectively.

Some example results of the complexity measuring implementation are given in Table 1. From the table we can observe that (as expected) the best result of the SCC method is always at least as good as, and in all larger cases better than, the result of the Böhm-Jacopini method. A detailed description of the examples is available in Appendix B.


(a) The final syntax tree with V = 11

(b) The final syntax tree with V = 5

(c) The text code representation of tree (a):

    begin
      if a then
      begin
        a1;
        var_bool := true;
        repeat
          b;
          if d then
          begin
            d1;
            c;
            var_bool := false;
          end
          else
            var_bool := true;
        until var_bool;
      end
      else
      begin
        a2;
        var_bool := true;
        repeat
          c;
          b;
          if d then
          begin
            d1;
            var_bool := false;
          end
          else
            var_bool := true;
        until var_bool;
      end;
      d2;
    end.

(d) The text code representation of tree (b):

    begin
      if a then
        a1
      else
      begin
        a2;
        c;
      end;
      var_bool := true;
      repeat
        b;
        if d then
        begin
          d1;
          c;
          var_bool := false;
        end
        else
          var_bool := true;
      until var_bool;
      d2;
    end.

Figure 13: Two final syntax tree examples and text code representation

Table 1. Example cases for the complexity measuring implementation (n is the number of nodes in the flow graph and V is the complexity measure proposed in Section 3.3). Case 3 represents the example from Figure 1.

          Initial         Böhm-Jacopini     SCC method (non-deterministic)
          flow graph      (deterministic)   # Result    Min V          Max V
    Case    n      V        n      V        count        n     V        n     V
    1       8      3        8      3          1          8     3        8     3
    2       9      9       12      4          1         12     4       12     4
    3      10      9       26     11          5         17     5       32    12
    4      14     36       38     18         12         25    11       63    29
    5      50    156       82     64         52         71    32       82    64
    6     100    276      237    154         72        112    84      289   312


Chapter 5.

Conclusions and Related Work

In this paper we take a first step towards an implementation of existing flow graph decomposition methods using graph transformations.

As stated in the introduction, well-structuredness was one of our main guidelines. We investigated several alternative and mutually complementary classical methods of flow diagram decomposition. We implemented the Böhm-Jacopini approach in terms of graph transformations employing the graph-transformation tool Groove. For the implementation we used an extended concept of equivalent flow graphs defined through the notion of context-free languages.

The Böhm-Jacopini decomposition method was enhanced by the method of Peterson et al., which examines strongly connected components for multiple entry points and removes extra entry points by node duplication.

In the introduction we stated that the well-structuredness of models is very important. Our full complexity measure for flow diagrams reflects an intuitive notion of readability. It enables us to compare the final syntax trees, to evaluate different decomposition methods and the different results of non-deterministic methods, and to minimize the complexity of the result.

An important issue is to expand the set of implemented methods and apply them to improve software reliability and readability, for instance in model transformations from UMLA to Java programs. A concise review of many other results developed in this field is given in [EH94].

The described approach is still work in progress. Applying well-formed structures is just the first step in the general decomposition approach: the next step is to review the different cases of flow graphs with parallelism and loops and to develop a universal method similar to the one for simple flow graphs without parallelism.

In general, we intend to investigate the applicability of our framework to enhance a model transformation from UMLA to structured models and to formally prove the correctness of this transformation. After enriching that model transformation, our long-term goal is to apply the same methods to transformations from UMLA to business process execution languages.

Acknowledgements: The research in this paper was carried out in the GRASLAND project, funded by the Dutch NWO (project number 612.063.408).


Bibliography

[AU73] Aho, A.V., and Ullman, J.D. The Theory of Parsing, Translation and Compiling. Prentice Hall, Englewood Cliffs, N.J., 1973.

[Aig97] Aigner, M. Combinatorial Theory. Springer-Verlag, Berlin Heidelberg, 1997.

[All70] Allen, F.E. Control Flow Analysis. ACM Sigplan Notices, 1970.

[Amm92] Ammarguellat, Z. A control-flow normalization algorithm and its complexity. IEEE Transactions on Software Engineering, Vol. 18, No. 3, pp. 237–251, 1992.

[AM71] Ashcroft, E., and Manna, Z. The translation of 'GOTO' programs to 'WHILE' programs. Proceedings of IFIP Congress, Ljubljana, Yugoslavia, pp. 250-255, 1971.

[Bak77] Baker, B. An algorithm for structuring flowgraphs. Journal of the ACM, Vol. 24, No. 1, pp. 98-120, 1977.

[Ber73] Berge, C. Graphs and Hypergraphs. Amsterdam, The Netherlands: North-Holland, 1973.

[BJ66] Böhm, C., and Jacopini, G. Flow diagrams, Turing machines and languages with only two formation rules. Communications of the ACM, Vol. 9, No. 5, pp. 366-371, 1966.

[BS65] Busacker, R.G., and Saaty, T.L. Finite Graphs and Networks: An Introduction with Applications. McGraw-Hill Book Co., New York, 1965.

[CV06] Collar, E., and Valerdi, R. Role of Software Readability on Software Development Cost. 21st Forum on COCOMO and Software Cost Modeling, Herndon, VA, 2006.

[EEP06] Ehrig, H., Ehrig, K., Prange, U., and Taentzer, G. Fundamentals of Algebraic Graph Transformation. Springer, 2006.

[EKR08] Engels, G., Kleppe, A.G., Rensink, A., et al. From UML Activities to TAAL - Towards Behavior-Preserving Model Transformations. Proceedings of the European Conference on Model Driven Architecture - Foundations and Applications (ECMDA-FA). Lecture Notes in Computer Science 5095, Springer Verlag, Berlin, Germany, pp. 94-109, 2008.

[EH94] Erosa, A.M., and Hendren, L.J. Taming control flow: A structured approach to eliminating goto statements. Proceedings of the 1994 International Conference on Computer Languages, Toulouse, France, pp. 229–240, 1994.

[Har89] Harary, F. Graph Theory. Addison-Wesley, Canada, 1989.

[KR08] Kleppe, A.G., and Rensink, A. A Graph-Based Semantics for UML Class and Object Diagram. Technical Report TR-CTIT-08-06 Centre for Telematics and Information Technology, University of Twente, Enschede, 2008.

[LPW02] Linger, R., Pleszkoch, M., Walton, G., and Hevner, A. Flow-Service-Quality (FSQ) Engineering: Foundations for Network System Analysis and Development. Pittsburgh, PA: Software Engineering Institute, Carnegie Mellon University, 2002.

[McC76] McCabe, T. A Complexity Measure. IEEE Transactions on Software Engineering, Vol. 2, No. 4, pp. 308-320, 1976.

[Mil72] Mills, H.D. Mathematical foundations for structured programming. IBM Federal System Div., Gaithersburg, 1972.


Proposals (RFP). http://www.omg.org/cgi-bin/doc?admtf/05-02-02.pdf, 2005.

[PKT73] Peterson, W.W., Kasami, T., and Tokura, N. On the capabilities of while, repeat and exit statements. Communications of the ACM, Vol. 16, No. 8, pp. 503-512, 1973.

[Ren04] Rensink, A. The GROOVE Simulator: A Tool for State Space Generation. In: Pfaltz, J.L., Nagl, M., Bohlen, B. (eds.) AGTIVE 2003. LNCS, vol. 3062, pp. 479–485. Springer, Heidelberg, 2004.

[WY96] Woods, S., and Yang, Q. The Program Understanding Problem: Analysis and a Heuristic Approach. Proceedings of the 18th International Conference on Software Engineering (ICSE), IEEE, Berlin, Germany, pp. 6–15, 1996.

[ZHB05] Zhao, W., Hauser, R., Bhattacharya, K., and Bryant, B. Compiling Business Processes: Untangle Unstructured Loops in Irreducible Flow Graphs. Technical report UABCIS-TR-2005-0505-1, Department of Computer and Information Sciences, University of Alabama at Birmingham, 2005.


Appendix A.

GROOVE Rule Description

1. Flow diagram and syntax trees. In the first step of our transformation we copy the initial flow diagram Φ to create the same structure for the syntax tree T such that:

• every node v ∈ Φ has a node image Im(v) ∈ T with the same node label;

• every edge (v1, v2) ∈ Φ has an edge image Im(v1, v2) ∈ T in the syntax tree with the edge label flow;

• every node v ∈ Φ is connected with its image Im(v) ∈ T by an edge (v, Im(v)) with the edge label synt;

• each rule creates and maintains the connection between flow diagram elements and syntax tree elements by edges with the label synt.

Priority 12. CopyNodes


2. Contraction rules. For each type of elementary flow diagram (Π, Ω, Δ and Ξ), we design one flow diagram contraction rule that introduces the necessary syntax tree elements and contracts the elementary flow diagram to one node.

The notation for contraction rules is as follows:

• the node types are used in compliance with the paper notation;

• to enumerate all possible labels of an edge we use separate labels or regular expressions such as “?[next, true, false]”;

• to express that two nodes are distinct we use an edge with the label “!=” between them.

Priority 10.

IfThenElseNodes


SequenceNodes


3. Decomposition rules. The flow diagram decomposition process operates top-down, starting from the root-node of the flow diagram under construction and choosing an appropriate type of flow diagram (I, II or III) as was discussed in the paper.

The notation for decomposition rules is as follows:

• the node types are used in compliance with the paper notation;

• to enumerate all possible labels of an edge we use separate labels or regular expressions such as “?[next, true, false]”;

• to express that two nodes are distinct we use an edge with the label “!=” between them;

• to define logical behavior we use the universal quantifier ∀ and the existential quantifier ∃; the quantifier ∀>0 restricts the universal quantifier to cases in which at least one matching node exists.

The flow diagram decomposition process operates as follows:

• in the first step of the decomposition process a new node Group is added and connected with the first and last flow diagram elements by edges first and last, respectively; for types II and III it is also connected with the first elements of the conditional branches true and false by edges t_in and f_in, respectively;

• for types II and III an iterative process finds all elements in the conditional branches true and false and connects them with the node Group by edges t_in and f_in, respectively;

• in the last step of the decomposition process, the nodes located in both conditional branches true and false are duplicated and the flow diagram is extended by the special functions T, F, K and the predicate omg.

Priority 9.


GroupType2-Normalization

Priority 8.


GroupType1-Step2

GroupType2-3-Step1


GroupType3-Step3-In


Priority 7.

GroupType2Normal-2Edges

Priority 6.

GroupType-CloseGroup


4. Syntax trees. In the last step of our transformation we delete the contracted flow diagram elements and obtain the final syntax tree.

Priority 5.


Appendix B.

Examples of Implementation

within GROOVE

Example 1. An example of a strongly decomposable flow graph with n = 8 nodes. In this case v = 3, ve = 1, vd = 0 and V = (3 + 0) × 1 = 3. Normalization methods yield the same results.

Example 2. An example of a flow graph that is not strongly decomposable, with n = 9 nodes, taken from [PKT73]. For the initial flow graph v = 3, ve = 3, vd = 0 and V = (3 + 0) × 3 = 9. The final syntax tree for the straight Böhm-Jacopini method has v = 4, ve = 1, vd = 0 and V = 4. The SCC method is not applicable in this case.

Example 3. The example of a flow graph with n = 10 nodes which was used throughout the paper. For the initial flow graph v = 3, ve = 3, vd = 0 and V = 9. The final syntax tree for the straight Böhm-Jacopini method has v = 6, ve = 1, vd = 4 and V = (6 + 5) × 1 = 11. The best of five final syntax trees obtained by the nondeterministic SCC method has v = 4, ve = 1, vd = 1 and V = (4 + 1) × 1 = 5.


Example 4a. The example of a real flow graph with n = 14 nodes as provided in (JSPWiki, 2008. Trouble Ticket Scenario, http://www.xpdl.org/wiki/Wiki.jsp?page=TroubleTicketScenario) by the Workflow Management Coalition, without the fork/join part. For the initial flow graph v = 6, ve = 6, vd = 0 and V = 36. The final syntax tree for the straight Böhm-Jacopini method has v = 12, ve = 1, vd = 6 and V = 18. The best of 12 final syntax trees obtained by the nondeterministic SCC method has v = 8, ve = 1, vd = 3 and V = 11.

Example 4b. The example of a real flow graph with n = 17 nodes as provided in (JSPWiki, 2008. Trouble Ticket Scenario, http://www.xpdl.org/wiki/Wiki.jsp?page=TroubleTicketScenario) by the Workflow Management Coalition, with the fork/join part. For the initial flow graph v = 10, ve = 10, vd = 0 and V = 100. The final syntax tree for the straight Böhm-Jacopini method has v = 12, ve = 6, vd = 4 and V = 96. The best of 12 final syntax


Example 5. This example (a flow graph with n = 50 random nodes and random edges) is interesting as a performance and scaling test case. For the initial flow graph v = 14, ve = 11, vd = 0 and V = 156. The final syntax tree for the straight Böhm-Jacopini method has v = 27, ve = 1, vd = 37 and V = 64. The best of 52 final syntax trees obtained by the


Example 6. This example (a flow graph with n = 100 random nodes and random edges) is interesting as a performance and scaling test case. For the initial flow graph V = 276. The final syntax tree for the straight Böhm-Jacopini method has V = 154. The best of 72 final syntax trees obtained by the nondeterministic SCC method has V = 84.