Formal analysis of consensus protocols in asynchronous distributed systems

Citation for published version (APA):

Atif, M. (2009). Formal analysis of consensus protocols in asynchronous distributed systems. (Computer science reports; Vol. 0916). Technische Universiteit Eindhoven.

Document status and date: Published: 01/01/2009

Document Version:

Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)


Formal Analysis of Consensus Protocols in Asynchronous Distributed Systems

Muhammad Atif

16th October 2009

Abstract

This paper presents a formal verification of two consensus protocols for distributed systems presented in [T. Deepak Chandra and S. Toueg, Unreliable failure detectors for reliable distributed systems, J. ACM, 1996]. These two protocols rely on two underlying failure detection protocols. We formalize an abstract model of the underlying failure detection protocols and, building upon this abstract model, formalize the two consensus protocols. We prove that both algorithms satisfy the properties of uniform agreement, uniform integrity, termination and uniform validity, assuming the correctness of their corresponding failure detectors.

1 Introduction

In a consensus protocol, each participating process proposes a value and eventually all (non-crashed) processes should reach a state in which they decide upon the same value. The decided value has to be chosen from the set of values proposed by the participating processes [3]. In an asynchronous environment, there is no upper bound on the delay of (reliable) communication channels; hence, a process cannot distinguish between a crashed process, for whose proposed value it does not have to wait, and a process connected to a very slow communication channel, whose proposed value has to be taken into account in the final result of the consensus. This forms the basic argument behind the impossibility of solving the consensus problem in an asynchronous environment in the presence of crash failures [4].

To circumvent this problem, the consensus protocols are built upon failure detectors, which by a synchronization mechanism can provide us with information about crashed (i.e., permanently halted) and correct processes. Upon query at any given time, the failure detector of each process outputs the list of its suspected processes. The information provided by a failure detector is not necessarily accurate and hence, failure detectors can only suspect other processes. The unreliability of failure detectors is in turn the result of unbounded delays in the asynchronous communication channels. Hence, at each moment of time, the output of any two failure detectors can be different.

We formalize and verify two algorithms (also called protocols) for solving the consensus problem proposed by [1]; one uses strong completeness with weak accuracy and the other uses strong completeness with eventual weak accuracy. Strong completeness refers to suspecting all crashed processes, i.e., after a certain amount of time every correct process permanently suspects each crashed process. Weak accuracy means that some correct process is never suspected. Eventual weak accuracy means that after a certain amount of time, some correct process is never suspected. The first consensus protocol, relying on strongly complete and weakly accurate failure detectors, tolerates N − 1 process failures (N is the total number of processes in the asynchronous system), whereas the second, relying on a strongly complete and eventually weakly accurate failure detector, requires a majority of processes to be correct [1]. If the network guarantees the said number of processes to be correct, we prove that both consensus algorithms satisfy the functional requirements of uniform agreement, uniform integrity, termination and uniform validity, to be defined precisely in the remainder of this report.

Structure of the paper. We give an informal description of the two consensus protocols in Sections 2.2 and 2.3 and process-algebraic specifications of them in Sections 3.2 and 3.3, respectively. The requirements of the protocols and their results are presented in Section 4. The paper is concluded in Section 5.

2 Consensus Protocols

Consensus protocols ensure that all correct processes eventually reach a consensus on one value, called the decided value. The decided value is always selected from a set of values, to which every process (at the beginning of the protocol) contributes one value, called the proposed value. A process will not come to a decision if it fails by crashing, i.e., permanently halting. A failure pattern, denoted by F in the remaining text, is a function from T to 2^π, where T is the set of natural numbers, denoting discrete time, and π = {p1, p2, . . . , pn} is the set of participating processes. During the execution of the protocols, a failure detector D makes (possibly unreliable) information available about the failure pattern F. Next we explain the general assumptions on which the forthcoming algorithms rely.

2.1 General assumptions

1. If a process crashes, it never recovers: if F(t) denotes the set of processes that have crashed up to time t, then F(t) ⊆ F(t + 1).

2. All failure detectors are unreliable. This means that they can suspect correct processes or unsuspect crashed processes at any time. Hence, in general, for each process p, H(p, t) is unrelated to H(p, t + 1), where H is the failure detector history, a function from π × T to 2^π. H provides the history of a failure detector D_p up to time t, i.e., a timed trace of the lists of processes suspected by p up to time t. It is assumed that there is a discrete global clock that acts as a fictional device and the processes do not have access to it. Due to the unreliability of failure detectors, it is also possible for two distinct processes p and q that H(p, t) ≠ H(q, t) at some time t.

3. A solution for the consensus problem is proposed in the setting of asynchronous distributed systems, in which there is no upper bound on:

(a) message delays,

(b) clock drifts, and

(c) the amount of time necessary to execute a step.

4. The failure detectors of all correct processes satisfy strong completeness, i.e., eventually every crashed process is permanently suspected by their failure detectors. Following [1], this description is formalized as:

$$\forall F,\ \forall H \in D(F),\ \exists t \in T,\ \forall p \in \mathit{crashed}(F),\ \forall q \in \mathit{correct}(F),\ \forall t' \geq t : p \in H(q, t')$$

Here D(F) is a set of failure detector histories, correct(F) = π − crashed(F), and crashed(F) = ⋃_{t ∈ T} F(t).

5. Although the failure detectors are unreliable, they are assumed to satisfy some notion of accuracy. A failure detector is weakly accurate when some correct process is never suspected; it is eventually weakly accurate if it eventually never suspects some correct process. The following formula, due to [1], formalizes weak accuracy:

$$\forall F,\ \forall H \in D(F),\ \exists p \in \mathit{correct}(F),\ \forall t \in T,\ \forall q \in \pi - F(t) : p \notin H(q, t)$$

6. The consensus algorithm that relies on strong completeness with weak accuracy can tolerate any number of process failures, whereas the other consensus algorithm, which relies on strong completeness and eventual weak accuracy, requires a majority of the processes to be correct.
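The completeness and accuracy properties in assumptions 4 and 5 can be made concrete on a finite trace. The sketch below is only an illustration, not part of the verified model: it assumes a finite time horizon (so "permanently" means "until the end of the trace"), represents F as a list of sets indexed by time and H as a dictionary from process to a list of suspect sets, and the function names are hypothetical. The eventual weak accuracy check follows the informal description above.

```python
def crashed(F):
    """All processes that crash at some time within the finite horizon."""
    return set().union(*F)

def strong_completeness(F, H, procs):
    """From some time t on, every correct q suspects every crashed p."""
    dead, horizon = crashed(F), len(F)
    correct = procs - dead
    return any(
        all(p in H[q][t2] for p in dead for q in correct
            for t2 in range(t, horizon))
        for t in range(horizon)
    )

def weak_accuracy(F, H, procs):
    """Some correct p is never suspected by any process that is still alive."""
    correct, horizon = procs - crashed(F), len(F)
    return any(
        all(p not in H[q][t] for t in range(horizon) for q in procs - F[t])
        for p in correct
    )

def eventual_weak_accuracy(F, H, procs):
    """From some time t on, some correct p is not suspected by correct processes."""
    correct, horizon = procs - crashed(F), len(F)
    return any(
        all(p not in H[q][t2] for t2 in range(t, horizon) for q in correct)
        for t in range(horizon) for p in correct
    )

procs = {1, 2, 3}
F = [set(), {3}, {3}, {3}]                      # process 3 crashes at time 1
H = {1: [{2}, {2}, {3}, {3}],                   # 1 wrongly suspects 2 at first
     2: [{1}, set(), {3}, {3}],                 # 2 wrongly suspects 1 at time 0
     3: [set(), set(), set(), set()]}
print(strong_completeness(F, H, procs),
      weak_accuracy(F, H, procs),
      eventual_weak_accuracy(F, H, procs))
```

In this trace both correct processes are wrongly suspected at time 0, so weak accuracy fails, while strong completeness and eventual weak accuracy hold from time 2 onwards.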

Along with the property of strong completeness, the algorithms discussed in Sections 2.2 and 2.3 rely on the above assumptions together with the properties of weak accuracy and eventual weak accuracy, respectively.

2.2 Solving consensus using strong completeness and weak accuracy

This algorithm assumes the properties of strong completeness and weak accuracy and solves the consensus problem in an asynchronous system provided that at least one correct process is never suspected by any failure detector. The algorithm has three phases and each process, if it remains operational, goes through all phases (from the first to the last). Suppose that n is the total number of processes in the network. In the first phase, each (non-crashed) process p executes n − 1 rounds. In every round each process broadcasts a message that contains its proposed value v_p and then receives the same type of message from the other non-suspected processes. At the end of this phase, every process updates its set of proposed values. These values are obtained either directly from other processes or indirectly, when the proposing process is correct but erroneously suspected.

In the second phase, all correct processes exchange their sets of values and make them identical to each other by dropping values that are not part of some received set. In the third and last phase, each process decides the first available value in its set. The algorithm for solving the consensus problem using strong completeness and weak accuracy, due to [1], is given below; every process p executes it with a distinct proposed value v_p.

Algorithm 1 Process(v_p)

    V_p := ⟨⊥, ⊥, . . . , ⊥⟩    {p's estimate of the proposed values}
    V_p[p] := v_p
    Δ_p := V_p    {To send/receive proposed values}

    Phase 1: {Asynchronous rounds r_p, 1 ≤ r_p ≤ n − 1}
    for r_p = 1 to n − 1 do
        send (r_p, Δ_p, p) to all
        wait until [∀q : received (r_p, Δ_q, q) or q ∈ D_p]
            {Query the failure detector and get D_p, i.e., a set of suspected processes. If q ∉ D_p then receive message from q for round r_p}
        msgs_p[r_p] := {(r_p, Δ_q, q) | received (r_p, Δ_q, q)}
        Δ_p := ⟨⊥, ⊥, . . . , ⊥⟩
        for k = 1 to n do
            if V_p[k] = ⊥ and ∃(r_p, Δ_q, q) ∈ msgs_p[r_p] with Δ_q[k] ≠ ⊥ then
                V_p[k] := Δ_q[k]
                Δ_p[k] := Δ_q[k]
            end if
        end for
    end for

    Phase 2:
    send V_p to all
    wait until [∀q : received V_q or q ∈ D_p]
    lastmsgs_p := {V_q | received V_q}
    for k = 1 to n do
        if ∃V_q ∈ lastmsgs_p with V_q[k] = ⊥ then
            V_p[k] := ⊥
        end if
    end for

    Phase 3:
    decide (first non-⊥ element of V_p)
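To make the three phases tangible, the following is a small lock-step simulation of Algorithm 1. It is a sketch under strong simplifying assumptions that are not part of the report: no process crashes, rounds are executed synchronously, processes are numbered from 0, and each process p ignores a fixed set suspects[p] of (erroneously) suspected senders throughout the run. The function name run_algorithm1 is hypothetical.

```python
BOTTOM = None  # stands for the null value ⊥

def run_algorithm1(proposals, suspects):
    """Simulate Algorithm 1; suspects[p] is the set of senders that p ignores."""
    n = len(proposals)
    V = [[BOTTOM] * n for _ in range(n)]
    for p in range(n):
        V[p][p] = proposals[p]
    delta = [list(v) for v in V]

    # Phase 1: n - 1 rounds of exchanging the newly learned values (delta).
    for _round in range(1, n):
        sent = [list(d) for d in delta]          # snapshot of this round's messages
        for p in range(n):
            received = [sent[q] for q in range(n) if q not in suspects[p]]
            delta[p] = [BOTTOM] * n
            for k in range(n):
                if V[p][k] is BOTTOM:
                    for dq in received:
                        if dq[k] is not BOTTOM:
                            V[p][k] = dq[k]
                            delta[p][k] = dq[k]
                            break

    # Phase 2: drop every value that is missing from some received vector.
    sent_v = [list(v) for v in V]
    for p in range(n):
        for q in range(n):
            if q in suspects[p]:
                continue
            for k in range(n):
                if sent_v[q][k] is BOTTOM:
                    V[p][k] = BOTTOM

    # Phase 3: decide the first non-bottom element.
    return [next(x for x in V[p] if x is not BOTTOM) for p in range(n)]

# Process 0 is never suspected (weak accuracy); 1 and 2 suspect each other.
print(run_algorithm1([10, 20, 30], {0: {1, 2}, 1: {2}, 2: {1}}))
```

Because process 0 is never suspected, its vector reaches every process in Phase 2 and all processes end up deciding the value proposed by process 0.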

2.3 Solving consensus using strong completeness and eventual weak accuracy

In the previous section, we gave the algorithm to solve consensus using strong completeness and weak accuracy, where at least one process was supposed to be correct. Now we introduce the algorithm, proposed in [1], to solve the same problem with strong completeness and eventual weak accuracy. This algorithm requires a majority of processes to be correct. The protocol is executed in rounds and in each round there is a unique coordinator, namely the one with identifier c = (r mod n) + 1. If a process is correct, whether or not it is suspected, it eventually decides some value with the consent of the coordinator.

In every round there are four phases. In the first phase each process sends its proposed value (estimate) to the coordinator (timestamped with the round number). In the second phase, the coordinator receives the estimates from non-suspected processes and then selects one of them as its new estimate. The selected value is the estimate of a process that has the largest timestamp. In the same phase, the coordinator broadcasts its estimate. In the third phase, processes receive the value sent by the coordinator and send back either ack (acknowledgement message) if the coordinator is not suspected or otherwise nack (no acknowledgement). In the fourth phase, the coordinator waits for ⌈(n+1)/2⌉ replies and if all of them are of type ack then estimate_c is locked; otherwise it starts a new round and consequently the other processes waiting for a decision also start a new round. The only reason to send a nack message (in Phase 3) is a suspicion (raised by the failure detector) of the coordinator. However, if all of the ⌈(n+1)/2⌉ acknowledgements (ack type messages) are received, then the coordinator decides the locked value and broadcasts it through a channel, called R-broadcast. Every process p in this protocol executes the following algorithm [1], where the parameter v_p is the value proposed by p.

Algorithm 2 Process(v_p)

    estimate_p := v_p    {estimate_p is p's estimate of the decision value}
    state_p := undecided
    r_p := 0    {r_p is p's current round number}
    ts_p := 0    {ts_p is the last round in which p updated estimate_p}

    {Rotate through coordinators until decision is reached}
    while state_p = undecided do
        r_p := r_p + 1
        c_p := (r_p mod n) + 1    {c_p is the current coordinator}

        Phase 1: {All processes p send estimate_p to the current coordinator}
        send (p, r_p, estimate_p, ts_p) to c_p

        Phase 2: {The current coordinator gathers ⌈(n+1)/2⌉ estimates and proposes a new estimate}
        if p = c_p then
            wait until [for ⌈(n+1)/2⌉ processes q : received (q, r_p, estimate_q, ts_q) from q]
            msgs_p[r_p] := {(q, r_p, estimate_q, ts_q) | p received (q, r_p, estimate_q, ts_q) from q}
            t := largest ts_q such that (q, r_p, estimate_q, ts_q) ∈ msgs_p[r_p]
            estimate_p := select one estimate_q such that (q, r_p, estimate_q, t) ∈ msgs_p[r_p]
            send (p, r_p, estimate_p) to all
        end if

        Phase 3: {All processes wait for the new estimate proposed by the current coordinator}
        wait until [received (c_p, r_p, estimate_cp) from c_p or c_p ∈ D_p]
        if [received (c_p, r_p, estimate_cp) from c_p] then    {p received estimate_cp from c_p}
            estimate_p := estimate_cp
            ts_p := r_p
            send (p, r_p, ack) to c_p
        else
            send (p, r_p, nack) to c_p    {p suspects that c_p crashed}
        end if

        Phase 4: {The current coordinator waits for ⌈(n+1)/2⌉ replies. If they indicate that ⌈(n+1)/2⌉ processes adopted its estimate, the coordinator R-broadcasts a decide message}
        if p = c_p then
            wait until [for ⌈(n+1)/2⌉ processes q : received (q, r_p, ack) or (q, r_p, nack)]
            if [for ⌈(n+1)/2⌉ processes q : received (q, r_p, ack)] then
                R-broadcast (p, r_p, estimate_p, decide)    {reliable broadcast}
            end if
        end if
    end while

    {If p R-delivers a decide message, p decides accordingly}
    when R-deliver (q, r_q, estimate_q, decide)
        if state_p = undecided then
            decide (estimate_q)
            state_p := decided
        end if
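Two small pieces of arithmetic in Algorithm 2 are easy to get wrong: the rotating coordinator c_p = (r_p mod n) + 1 and the reply threshold ⌈(n+1)/2⌉. The sketch below spells them out together with the Phase 4 test on the collected replies. The helper names are hypothetical, and replies are modelled simply as a list of the strings "ack" and "nack" in arrival order, which is an assumption made for illustration only.

```python
import math

def coordinator(r, n):
    """Identifier (1..n) of the coordinator of round r."""
    return (r % n) + 1

def majority(n):
    """Number of estimates or replies the coordinator waits for."""
    return math.ceil((n + 1) / 2)

def coordinator_decides(replies, n):
    """Phase 4: decide only if the first majority(n) replies are all acks."""
    needed = majority(n)
    first = replies[:needed]
    return len(first) == needed and all(r == "ack" for r in first)

n = 3
print([coordinator(r, n) for r in range(1, 7)])   # coordinators of rounds 1..6
print(majority(3), majority(4), majority(5))
print(coordinator_decides(["ack", "ack"], n), coordinator_decides(["ack", "nack"], n))
```

With n = 3 the coordinator cycles through processes 2, 3, 1, and two matching acks are enough for a decision.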

3 Formal Specification

In this section, we discuss the formalization of the consensus algorithms given in Sections 2.2 and 2.3. We use mCRL2 [6] as our formal specification language. We need some data types, functions and operators to specify the behaviour of the protocols in terms of communication channels, failure detectors and the different phases of the protocols. In the formal specification of both algorithms, we use a separate channel for every type of message in every round to model asynchrony with respect to communication channels. So there is no bound on message delays and a message sent in a previous round can reach its destination after a message of the current round.

3.1 Data types

We use the built-in data types of mCRL2: B (Booleans, i.e., true or false), Z (integers) and N (natural numbers). The toolset defines both Z and N as unbounded, i.e., there is no largest number in these data types (and no smallest for Z). The toolset also provides many data structures; we use one of them, called List, to handle homogeneous data, e.g., estimates, msgs, lastMsgs etc.

3.2 Consensus with strong completeness and weak accuracy

Before discussing the formalization details of the protocol, we present all auxiliary functions, which are defined in the form of rewrite rules. Function types are used to define customized transformations on (a combination of) abstract data types. We define the following customized functions, where the keywords map, var and eqn in mCRL2 are used for function signatures, variable declarations and function definitions (in terms of equations), respectively.

• minus: To subtract a list from another, e.g., if A and B are two lists of natural numbers then minus(A, B) is also a list, containing all elements of A which do not belong to B. This definition is formally specified as:

    map
        minus : List(N) × List(N) → List(N);
        eliminate : List(N) × N → List(N);
            {to eliminate the first occurrence of a value from the list}
    var
        ln, lg : List(N);
        m, n : N;
    eqn
        minus([], lg) = [];    {[] is an empty list}
        minus(ln, []) = ln;
        minus(n ▷ ln, m ▷ lg) =
            if(m ∈ n ▷ ln, minus(eliminate(n ▷ ln, m), lg), minus(n ▷ ln, lg));
            {▷ is the operator to insert an element at the head of a list}
        eliminate(n ▷ ln, m) = if(n ≈ m, ln, n ▷ eliminate(ln, m));

• makeIdentical: This function makes two lists (of the same size) identical by replacing, at each location, every element that appears in one list but not in the other with ⊥ (used for the null value). In Phase 2, processes exchange their lists of values and use this function to make them identical.

    map
        makeIdentical : List(N) × List(N) → List(N);
    var
        lg, ln : List(N);
        x, n : N;
    eqn
        makeIdentical([], ln) = ln;
        makeIdentical(ln, []) = [];
        makeIdentical(x ▷ lg, n ▷ ln) =
            if(x ≈ ⊥, ⊥ ▷ makeIdentical(lg, ln), n ▷ makeIdentical(lg, ln));

• findDecided: This function finds the first available non-⊥ value in a list. Each process uses this function in Phase 3 to decide a value.

    map
        findDecided : List(N) → N;
    var
        ln : List(N);
        n : N;
    eqn
        findDecided([]) = ⊥;
        findDecided(n ▷ ln) = if(n ≉ ⊥, n, findDecided(ln));

• updateDelta: ∆ is the list used in every round of Phase 1 to send the proposed values to all other processes. After sending ∆, each process initializes it with ⊥ and then updates it with the values received in the current round but not in the previous rounds. To update the data values in this list, the function updateDelta is used. This function is only defined when the three lists have the same size.

    map
        updateDelta : List(N) × List(N) × List(N) → List(N);
    var
        lg, ln, ld : List(N);
        x, n, m : N;
    eqn
        updateDelta([], lg, ln) = [];
        updateDelta(n ▷ lg, m ▷ ln, x ▷ ld) =
            if(m ≉ n, m ▷ updateDelta(lg, ln, ld), x ▷ updateDelta(lg, ln, ld));

• updateMsgs: In Phases 1 and 2, processes use two lists, msgs and lastmsgs respectively, to store the lists of other processes. This function helps the processes to store a list at a particular location.

    map
        updateMsgs : N × List(List(N)) × List(N) → List(List(N));
    var
        lg, ln : List(N);
        n : N;
        msgs : List(List(N));
    eqn
        updateMsgs(⊥, lg ▷ msgs, ln) = ln ▷ msgs;
        updateMsgs(⊥, [], ln) = [ln];
        (n > 0) → updateMsgs(n, lg ▷ msgs, ln) = lg ▷ updateMsgs(Int2Nat(n − 1), msgs, ln);
            {the Int2Nat function converts an integer value to a natural number}

• updateCrashed: Failure detectors use this function to add a crashed process to the list of suspects.

    map
        updateCrashed : List(N) × N → List(N);
    var
        ln : List(N);
        n : N;
    eqn
        updateCrashed(ln, n) = if(n ∈ ln, ln, n ▷ ln);
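For readers less familiar with rewrite rules, the auxiliary functions above can be mirrored in Python. This is a hedged transcription rather than the mCRL2 semantics: ⊥ is represented by None, lists are Python lists, the snake_case names are hypothetical, and eliminate on an empty list (which the equations leave undefined) is assumed to return an empty list.

```python
BOTTOM = None  # stands for the null value ⊥

def eliminate(xs, m):
    """Remove the first occurrence of m from xs (empty input: assumption)."""
    if not xs:
        return []
    return xs[1:] if xs[0] == m else [xs[0]] + eliminate(xs[1:], m)

def minus(a, b):
    """Elements of a after removing one occurrence of each element of b."""
    if not a:
        return []
    if not b:
        return list(a)
    m, rest = b[0], b[1:]
    return minus(eliminate(a, m), rest) if m in a else minus(a, rest)

def make_identical(lg, ln):
    """Keep ln's value at each position unless lg holds ⊥ there."""
    if not lg:
        return list(ln)
    if not ln:
        return []
    head = BOTTOM if lg[0] is BOTTOM else ln[0]
    return [head] + make_identical(lg[1:], ln[1:])

def find_decided(ln):
    """First non-⊥ value of the list, or ⊥ if there is none."""
    for n in ln:
        if n is not BOTTOM:
            return n
    return BOTTOM

def update_delta(old_v, new_v, delta):
    """Positions where new_v differs from old_v carry the new value."""
    return [m if m != n else x for n, m, x in zip(old_v, new_v, delta)]

def update_crashed(ln, n):
    """Add n at the head of the suspects list unless already present."""
    return ln if n in ln else [n] + ln

print(minus([1, 2, 3, 2], [2]))
print(make_identical([7, BOTTOM, 9], [7, 8, BOTTOM]))
print(find_decided([BOTTOM, 5, 6]))
print(update_delta([1, BOTTOM, BOTTOM], [1, 4, BOTTOM], [BOTTOM] * 3))
```

Note that minus removes only one occurrence per element of its second argument, exactly as the eliminate equation prescribes.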

Next we discuss the process definitions which specify the behaviour of every participant in the protocol.

3.2.1 The process for failure detectors

A failure detector provides a list of suspected processes whenever a process requests it. In [1], the behaviour of a failure detector is defined in terms of abstract properties. In accordance with these properties, we devise one process to represent the failure detectors of all processes, as shown in Figure 1, where the processes query the failure detector and get the list of suspects. To reduce the state space, we instantiate this process once and allow it to interact with the other processes in the network, which also communicate with each other in different phases and rounds.

[Figure 1: Failure detector used in the model for Algorithm 1, where π = {p1, p2, p3}. A single failure detector process interacts with p1, p2 and p3.]

This process eventually realizes the strong completeness property when a crashed process is permanently added to the list of suspects. Each process can query this process as if it were communicating with its local failure detector. This failure detector is unreliable, so it can mistakenly include correct processes (except one, when it satisfies weak accuracy) among the suspected processes. The property of weak accuracy is implemented in the process for Phase 1 (discussed in Section 3.2.2) to reduce the state space. Initially, the failure detector does not guarantee strong completeness, but non-deterministically at some later point it provides the complete list of crashed processes. We define this process by means of one parameter, crashed:

• crashed : List(N): The list of crashed processes, which is sent as a reply to the querying process. Initially this list is empty, but eventually it contains every crashed process.

    1: FD(crashed : List(N)) =
    2:     ∑_{id:N} rcv_addRequest(id) . FD(updateCrashed(crashed, id))
    3:     +
    4:     ∑_{p:π} send_list(crashed, p) . FD(crashed)

The process for the failure detector is named FD and has one parameter, as shown in line 1. We implemented the eventuality with the help of a process called CrashedProc. CrashedProc is a simple process (not defined here but given in appendices 1 and 2) through which a participant can send a message to the failure detector to add its ID to the list of crashed processes. It notices the process crashing and then continuously pings the failure detector until the ID of the crashed process is added to the list of suspects. Once the list has been updated with respect to a particular process, the failure detector permanently declares this process as suspected, but the time between the crash and the permanent suspicion is not fixed. FD has two non-deterministic choices: updating the list of crashed processes and replying to the query of a process, which are shown in lines 2 and 4, respectively. So eventually each crashed process becomes part of the list called crashed; hence the given failure detector satisfies the property of strong completeness.
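The two summands of FD can be read as two operations on a single piece of state. The following sketch mirrors that reading in Python; it is an illustration only, with a hypothetical class name, and it replaces the non-deterministic choice of the process algebra by explicit method calls made by the caller.

```python
class FailureDetector:
    """Single shared detector whose only state is the list of crashed processes."""

    def __init__(self):
        self.crashed = []              # initially no process is suspected

    def rcv_add_request(self, pid):
        """Counterpart of rcv_addRequest: record pid once, as updateCrashed does."""
        if pid not in self.crashed:
            self.crashed.insert(0, pid)

    def send_list(self, p):
        """Counterpart of send_list: reply to process p with the current suspects."""
        return list(self.crashed)

fd = FailureDetector()
print(fd.send_list(1))                 # a query before any crash is recorded
fd.rcv_add_request(3)
fd.rcv_add_request(3)                  # repeated pings do not duplicate the entry
print(fd.send_list(1))
```

Once an identifier has been recorded it is never removed, which corresponds to the permanent suspicion required by strong completeness.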

3.2.2 The process for Phase 1

We define this process with the help of the following six parameters:

• myId : N: The ID-number of the process.

• round : N: Every process executes n − 1 asynchronous rounds and this parameter denotes the current round number. In every round, each process p waits for the message of each correct process q, if q is not suspected.

• V : List(N): The list that contains the proposed values of all non-suspected processes.

• ∆ : List(N): The list to exchange the proposed values, as discussed in Section 3.2.

• msgs : List(List(N)): A two-dimensional list to store the messages of every process in each round.

• msg_sent : B: In every round a process sends its message and then waits without sending the next message. This parameter is used to enforce this sequence.

In the following definition we assume the existence of a process Correct ∈ π that remains operational and never gets suspected.

    1: Phase1(myId, round : N, V, ∆ : List(N), msgs : List(List(N)), msg_sent : B) =
    2:     (myId ≉ Correct) → crashed(myId) · CrashedProc(myId, false, false, false, false)
    3:     +
    4:     (round ≤ N − 1) → ((¬msg_sent) → send2all(round, ∆, myId) · Phase1(myId, round, V, ∆, msgs, true)
    5:         ⋄
    6:         ∑_{lst:List(N)} queryFD(lst, myId) · WaitandReceive(myId, round, V, ∆, msgs, minus(π, lst))
    7:     )
    8:     ⋄ Phase2(myId, V, [], false);
    9: WaitandReceive(myId, round : N, V, ∆ : List(N), msgs : List(List(N)), from : List(N)) =
    10:     (#from > 0) → (
    11:         ∑_{p:π} (p ∈ from) → ∑_{∆q:List(N)} receive(round, ∆q, p, myId) ·
    12:             (suspected(myId, p, false) · WaitandReceive(myId, round, V, [⊥, ⊥, ⊥], updateMsgs(p, msgs, ∆q), minus(from, [p]))
    13:             +
    14:             (p ≉ Correct) → suspected(myId, p, true) · WaitandReceive(myId, round, V, [⊥, ⊥, ⊥], msgs, minus(from, [p]))
    15:             )
    16:         +
    17:         rcv_stopWaiting(p) · WaitandReceive(myId, round, V, [⊥, ⊥, ⊥], msgs, minus(from, [p]))
    18:     )
    19:     ⋄
    20:     Phase1(myId, round + 1, update_V(V, msgs), updateDelta(V, update_V(V, msgs), [⊥, ⊥, ⊥]), msgs, false);

The above definition shows that a process in Phase 1 can crash or can send a message to others, as shown in lines 2 and 4, respectively. WaitandReceive is another process, defined in line 9, used to wait until a process has received all current-round messages from non-suspected processes. If, while waiting, it learns from the failure detector that some process q has crashed and q ∈ D_p, it stops waiting for the respective message, as shown in line 17. The process WaitandReceive has the same parameters as the process Phase1, except that msg_sent is replaced by a list called from. Initially, this list is equal to the non-suspected processes, i.e., π − suspects, and upon receiving a message from an arbitrary process, say p, it is updated as from := from − [p]. It is clear from the informal specification of Algorithm 1 that a process p is interested in getting the list of suspects and in knowing whether some process q belongs to D_p or not whenever p receives a message from q. So a process in Phase 1 always has two non-deterministic choices (suspect or unsuspect) for a process that is sending messages. If the last argument of the action suspected (given in lines 12 and 14) is true, then the sender of the message is suspected, so its message is discarded. The value false in the same action indicates non-suspicion, and thus the list ∆q is added to msgs using the function updateMsgs. The condition given in line 14 takes into account a correct process that is never suspected. When the list from is empty (the condition in line 10 fails), there is no process left to wait for, so the process moves to the next round of Phase 1 (line 20).

3.2.3 The process for Phase 2

The process for Phase 2 uses two parameters of Phase 1 (myId and V), a list called lastmsgs to store the lists of other processes, and a Boolean V_sent that records whether the process has already sent its own list.

    1: Phase2(myId : N, V : List(N), lastmsgs : List(List(N)), V_sent : B) =
    2:     (myId ≉ Correct) → send_crashed(myId) · CrashedProc(myId)
    3:     +
    4:     (¬V_sent) → send2all(0, V, myId) · Phase2(myId, V, lastmsgs, true)
    5:     ⋄
    6:     ∑_{lst:List(N)} queryFD(lst, myId) · WaitandReceive2(myId, V, lastmsgs, minus(π, lst))
    7: WaitandReceive2(myId : N, V : List(N), lastmsgs : List(List(N)), from : List(N)) =
    8:     (#from > 0) → ∑_{q:N} ∑_{Vq:List(N)} receive(Vq, q, myId) ·
    9:         WaitandReceive2(myId, V, updateMsgs(q, lastmsgs, Vq), minus(from, [q]))
    10:     ⋄
    11:     Phase3(myId, updateLastmsgs(lastmsgs, V));

In this phase, a process has the choice to crash if it is not the process Correct (such a process may be erroneously suspected by the failure detector). The second choice, shown in line 4, is to first send the list of values and then receive from all non-suspected correct processes. Line 6 shows that the process queries the failure detector before waiting and then waits by initiating a process called WaitandReceive2, defined in line 7. Every participant in this process receives the lists of proposed values from other processes and then moves to Phase 3 after making its list identical to the others.

3.2.4 The process for Phase 3

The process for Phase 3 is very simple. Each participant decides the first non-⊥ value from its list of available proposed values. The process for Phase 3 takes two parameters: the process ID and the list of values, which has already been updated in Phase 2. The definition of this process is:

    1: Phase3(myId : N, V : List(N)) = decide(myId, findDecided(V))

The above specification shows that each process in Phase 3 decides a (non-⊥) value from the proposed values and then stops.

3.3 Consensus with strong completeness and eventual weak accuracy

The specification of this protocol uses the functions discussed in Section 3.2. In this protocol different message types are sent and received in different phases. For example, in Phase 1, processes send their estimates, in Phase 3 acknowledgement messages (ack or nack) are communicated and in Phase 4 the processes either receive the decided value or start the next round. So we define different channels according to their message types. In this protocol, at any time, the coordinator is either the source or the destination of every message, i.e., other processes send their messages to the coordinator and receive messages from the coordinator only. To realize eventual weak accuracy, we define the following processes with the assumption that Correct ∈ π is one of the correct processes that is never suspected after a certain amount of time.

3.3.1 The process for failure detector

In this protocol the majority of the processes remains correct and we implement this property with the help of a failure detector. It keeps track of the number of crashes (f) and guarantees that f < ⌈(n+1)/2⌉. There are three parameters used in the definition:

• crashed : List(N): A list to store the ID-numbers of the crashed processes.

• totalCrashed : N: To keep track of the number of crashes.

• weaklyAccurate : B: To determine whether the failure detector satisfies weak accuracy or not.

    1: FD(crashed : List(N), totalCrashed : N, weaklyAccurate : B) =
    2:     (totalCrashed ≈ 0) → ∑_{id:N} rcv_crashed(id) · FD(crashed, totalCrashed + 1, weaklyAccurate)
    3:     +
    4:     ∑_{id:N} rcv_addRequest(id) · FD(updateCrashed(crashed, id), totalCrashed, weaklyAccurate)
    5:     +
    6:     (¬weaklyAccurate) → weakAccuracy . FD(crashed, totalCrashed, true)
    7:     +
    8:     (weaklyAccurate) → ∑_{round:N} ∑_{p:π} replyQuery(crashed, p, round)
    9:     ⋄
    10:     ∑_{round:N} ∑_{p:π} (replyQuery(crashed, p, round)
    11:         +
    12:         replyQuery(Addcrashed([Correct], crashed), p, round)) · FD(crashed, totalCrashed, weaklyAccurate);

In line 2, the failure detector checks the number of processes that have already crashed. If it is less than N/2 (i.e., equal to 0 if N = 3) and another process crashes in the meantime, then the counter for crash failures is incremented without immediately adding that process to the list of crashed processes. To meet the property of strong completeness, a crashed process is eventually added to the list of crashed processes, as shown in line 4. In the same way, weak accuracy is also eventual, so non-deterministically at some point the failure detector becomes weakly accurate (line 6), i.e., from then on, it will not report a particular correct process as crashed (line 8). Otherwise, due to the unreliability of the failure detector, it can send a list of crashed processes containing a correct process, as shown in line 12.
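The interplay of the crash counter, the eventual addition to the suspects list and the weaklyAccurate flag can be illustrated with a small Python object. This is a sketch under stated assumptions and not the mCRL2 process: the class name, the max_crashes bound (1 for N = 3) and the seeded random generator that stands in for non-deterministic choice are all hypothetical, and Addcrashed is modelled as putting Correct at the head of the list.

```python
import random

class EventuallyAccurateFD:
    """Detector that may wrongly report `correct` until weak accuracy sets in."""

    def __init__(self, correct, max_crashes=1, seed=0):
        self.correct = correct
        self.crashed = []
        self.total_crashed = 0
        self.max_crashes = max_crashes      # keeps a majority of processes alive
        self.weakly_accurate = False
        self.rng = random.Random(seed)      # stands in for non-determinism

    def rcv_crashed(self, pid):
        """Allow a crash only while the bound on crashes is not reached."""
        if self.total_crashed < self.max_crashes:
            self.total_crashed += 1
            return True
        return False

    def rcv_add_request(self, pid):
        """Eventually add a crashed process to the suspects (strong completeness)."""
        if pid not in self.crashed:
            self.crashed.insert(0, pid)

    def weak_accuracy(self):
        """From now on the process `correct` is never reported as suspected."""
        self.weakly_accurate = True

    def reply_query(self, p, rnd):
        """Reply with the suspects; before weak accuracy, maybe include `correct`."""
        if self.weakly_accurate or self.rng.random() < 0.5:
            return list(self.crashed)
        return [self.correct] + [x for x in self.crashed if x != self.correct]

fd = EventuallyAccurateFD(correct=1)
print(fd.rcv_crashed(3), fd.rcv_crashed(2))   # the second crash is refused
fd.rcv_add_request(3)
fd.weak_accuracy()
print(fd.reply_query(2, 0))
```

Before weak_accuracy() is called, a reply may or may not contain process 1; afterwards it never does.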

It is assumed that every sent message will be eventually delivered but the protocol specication gives us no information about a message that is sent from a process and the only recipient, i.e., the coordinator crashes before receiving it. Due to the asynchronous behaviour of the distributed system, the delays in channels are unbounded and there is no guarantee that messages will be delivered in the same order in which they are sent. To alleviate this problematic situation, we modeled the process for Phase 1 in a way that every process uses a separate channel for a message in each round. In this way the algorithm demonstrates the asynchronous behaviour. But to reach the terminated state, a process can go through several asynchronous rounds

(18)

[1], so we modeled the Phase 1 in a manner that if the algorithm does not terminate in N rounds (N is the number of processes) then the round number is reset to its initial value, shown in line 3.3.2. In every round, there is a new coordinator. So, the recipient varies with respect to round number. We dene this process by means of four parameters, myId, round, estimate and ts where ts is the last round number in which a process has updated its estimate (default is 0).

1: Phase1(myId, round, estimate, ts : ℕ) =
2:   (round ≤ N) → send(1, myId, round, estimate, ts) · Phase2(myId, round, estimate, ts, π, 0)
       ⋄ send(1, myId, 0, estimate, ts) · Phase2(myId, 0, estimate, ts, π, 0)
3:   +
4:   (myId ≉ Correct) → send_crashed(myId) · CrashedProc(myId, round, minus(π, [myId]), false)

Every process initiates this phase from Phase 1, but only the coordinator executes it; the rest of the processes jump to Phase 3. This phase is formally specified as:

1: Phase2(myId, round, estimate, ts : ℕ, from : List(ℕ), i : ℕ) =
2:   (myId ≉ Correct) → send_crashed(myId) · Crashed(myId, round, minus(π, [myId]), false)
3:   +
4:   ((round mod N) + 1 ≈ myId ∧ #from > 0) →
5:     ((i < (N + 1) div 2) →
6:       ∑_{q, estimate_q, ts_q : ℕ} rcvfrom(1, q, round, estimate_q, ts_q, myId) ·
7:       Phase2(myId, round, updateEstimate(estimate, estimate_q, ts, ts_q),
               isGreater(ts, ts_q), minus(from, [q]), i + 1)
8:     ⋄
9:       send(2, myId, round, estimate, ts) · Phase3(myId, round, estimate, ts)
10:    )
11:  ⋄
12:    Phase3(myId, round, estimate, ts);

Line 3.3.3 shows that a process can crash unless it is the process on which the weak accuracy of this protocol relies. In line 3.3.3, the coordinator waits for messages from at least ⌈(n+1)/2⌉ processes. If a process q sends its message such that ts_q > ts_c, then the coordinator updates its estimate by means of a specifically defined function, called updateEstimate, shown in line 3.3.3. After receiving the messages from the majority, the coordinator broadcasts its estimate and proceeds to Phase 3, as shown in line 3.3.3.
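The way the coordinator folds incoming estimates into its own state can be illustrated with a small Python sketch. It mirrors the equations for updateEstimate and isGreater given in Appendix B; the helper names `majority` and `collect_estimates` are hypothetical and are introduced here only for illustration, not taken from the specification.

```python
def update_estimate(estimate, estimate_q, ts, ts_q):
    """Mirror of updateEstimate(x, k, n, m) = if(m > n, k, x)."""
    return estimate_q if ts_q > ts else estimate


def is_greater(ts, ts_q):
    """Mirror of isGreater(n, m) = if(m > n, m, n)."""
    return ts_q if ts_q > ts else ts


def majority(n):
    """Number of messages the coordinator waits for: (N + 1) div 2."""
    return (n + 1) // 2


def collect_estimates(estimate, ts, messages, n):
    """Hypothetical helper: fold the first majority(n) messages
    (estimate_q, ts_q) into the coordinator's (estimate, ts) pair,
    as the recursive call of Phase2 does one message at a time."""
    for estimate_q, ts_q in messages[:majority(n)]:
        # Both new values are computed from the old ts, as in the recursion.
        estimate, ts = (update_estimate(estimate, estimate_q, ts, ts_q),
                        is_greater(ts, ts_q))
    return estimate, ts
```

For example, with N = 3 a coordinator holding estimate 7 with timestamp 0 that receives (5, 2) and then (9, 1) ends up with the pair (5, 2): the second message carries an older timestamp and is ignored.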

3.3.4 The process for Phase 3

We define the process for Phase 3 as:

1: Phase3(myId, round, estimate, ts : ℕ) =
2:   (myId ≉ Correct) → send_crashed(myId) · Crashed(myId, round, minus(π, [myId]))
3:   +
4:   rcv_CFailure(myId, round) · Phase1(myId, round + 1, estimate, ts)
5:   +
6:   ∑_{est_q, ts_q : ℕ} rcvfrom(2, (round mod N) + 1, round, est_q, ts_q, myId) ·
7:   ∑_{lst : List(ℕ)} rcv_list(lst, myId, round) ·
8:   ((round mod N) + 1 ∈ lst) →
       send3(myId, round, nack, (round mod N) + 1) · Phase4(myId, round, estimate, ts, 0, π)
9:   ⋄
10:    send3(myId, round, ack, (round mod N) + 1) · Phase4(myId, round, est_q, ts_q, 0, π);

Crashing of any process in this phase is shown in line 2, whereas line 4 shows the crash of the coordinator; if this happens, every process restarts Phase 1 with the next round number. According to the round number, a new coordinator is designated and the other processes send their estimates to it. If neither the process nor the coordinator has crashed, the process receives the estimate of the coordinator (line 3.3.4) and queries the failure detector (line 3.3.4) in order to send either ack or nack. The message ack is sent if the coordinator is not in the list of suspects (line 3.3.4); otherwise the message nack is sent as a reply (line 3.3.4).
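The choice between ack and nack described above can be sketched in a few lines of Python. The function name `phase3_step` and its tuple result are assumptions made for this illustration; the behaviour follows lines 8 to 10 of the listing, where a process keeps its own estimate on nack and adopts the coordinator's estimate on ack.

```python
def phase3_step(estimate, ts, est_q, ts_q, coordinator, suspects):
    """Hypothetical helper: return (reply, estimate, ts) for one process.

    estimate, ts   -- the process's current estimate and timestamp
    est_q, ts_q    -- the values received from the coordinator
    coordinator    -- id of the current coordinator
    suspects       -- list returned by the failure detector
    """
    if coordinator in suspects:
        # Coordinator is suspected: reply nack and keep the old state.
        return "nack", estimate, ts
    # Coordinator is trusted: reply ack and adopt its estimate.
    return "ack", est_q, ts_q
```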

In this phase, either all of the processes including the coordinator agree upon a value or they move to the next round. We define the process with two parameters in addition to those of Phase 3: i : ℕ and from : List(ℕ). The first one is used for counting the received messages and the second one (initially π) is used to receive one message from each process.

1: Phase4(myId, round, estimate, ts, i : ℕ, from : List(ℕ)) =
2:   (myId ≉ Correct) → send_crashed(myId) · Crashed(myId, round, minus(π, [myId]), false)
3:   +
4:   ((round mod N) + 1 ≈ myId) →
5:     ((i < (N + 1) div 2) →
6:       (∑_{q : ℕ} ∑_{msg_type : Ack_Type} rcvAckNack(q, round, msg_type, myId) ·
7:         (msg_type ≈ ack) →
8:           Phase4(myId, round, estimate, ts, i + 1, minus(from, [q]))
9:         ⋄
10:          StartNextRound(myId, round, estimate, ts, minus(from, [q]))
11:      ) ⋄
12:      sendDecision(myId, estimate, true) · decide(myId, estimate) · δ    {δ denotes the deadlock}
13:    ) ⋄
14:    Wait4decision(myId, round, estimate, ts, false, false);
15:
16: Wait4decision(myId, round, estimate, ts : ℕ, decided, finish : 𝔹) =
17:   waiting4decision(myId) ·
18:   (rcv_CFailure(myId, round) · Phase1(myId, round + 1, estimate, ts)
19:    +
20:    ∑_{v : ℕ} ∑_{done : 𝔹} rcvDecisioFrom(v, done, myId) · (done) → decide(myId, v) · δ
21:    ⋄
22:      Phase1(myId, round + 1, estimate, ts)
     );

The option for a process to crash is shown in line 3.3.5, and line 4 shows that the process waits for ⌈(n+1)/2⌉ messages if it is the coordinator. If a majority send ack messages, the coordinator decides and sends the decided value to all processes, as shown in line 3.3.5, and the respective channel ensures that this decided value is delivered.
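The coordinator's counting of replies in Phase 4 can be sketched as follows. This is a simplified Python illustration under the assumption that replies arrive as a list of the strings "ack" and "nack"; the function name `phase4_outcome` and the result labels are hypothetical.

```python
def phase4_outcome(replies, n):
    """Hypothetical helper: classify what the coordinator does.

    The coordinator counts acks until it has (N + 1) div 2 of them.
    A nack received before that threshold sends it to the next round.
    """
    needed = (n + 1) // 2
    acks = 0
    for reply in replies:
        if acks >= needed:
            break  # majority already reached, further replies are not read
        if reply != "ack":
            return "next_round"
        acks += 1
    return "decide" if acks >= needed else "waiting"
```

With N = 3 two acks suffice, so `["ack", "ack"]` leads to a decision, whereas `["ack", "nack"]` starts the next round.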

4 General Requirements

The general requirements of a consensus problem given in [1] are:

R1. Uniform Agreement: No two processes decide differently.

R2. Uniform Integrity: Each process decides at most once.

R3. Termination: All correct processes eventually decide on some value.

R4. Uniform Validity: If a process decides on value v, then v has been proposed by some process.

4.1 Requirement specification in the µ-calculus

In order to verify the requirements with respect to the formalization, they are specified in the modal µ-calculus ([7], extended with data-dependent processes and regular formulae).

R1. According to uniform agreement in [9], any two processes always decide the same value, i.e., the decision of all processes is unanimous [1, 8]. We devise the following formula for any two processes p, p′ ∈ π to ensure that their decided values cannot be different. Assume that V is the set of all values.

$$\forall_{v,v' \in V}\, \forall_{p,p' \in \pi}\; [\,true^{*} \cdot decide(p, v) \cdot true^{*} \cdot decide(p', v')\,]\,(v = v')$$

R2. The following formula specifies that, for each process p, the action decide(p, v), for any arbitrary value v, appears at most once in each trace. This in turn guarantees uniform integrity.

$$\forall_{p \in \pi}\, \forall_{v,v' \in V}\; [\,true^{*} \cdot decide(p, v) \cdot true^{*} \cdot decide(p, v')\,]\,false$$

R3. Termination of a process can be viewed in two different scenarios: crashed and correct. If a process crashes before reaching the last phase then, according to both Algorithms 1 and 2, it cannot decide a value. On the other hand, if it remains correct throughout the execution, it eventually decides a value, provided that the respective failure detector satisfies certain properties regarding accuracy and completeness. This requirement for Algorithm 1 is expressed in the µ-calculus as follows:

$$\forall_{p \in \pi}\; \mu X \cdot \big([\,\overline{crashed(p)} \wedge (\forall_{v \in V}\, \overline{decide(p, v)})\,]X \wedge \langle true \rangle true\big)$$

where p ∈ π and V is the set of proposed values. This formula states that either the action crashed or the action decide must unavoidably be taken. The formula does not speak about strong completeness because, according to Lemma 5 in [1], Algorithm 1 is blocked forever if a process p is waiting for a message from a crashed process q and q ∉ D_p, i.e., there is no strong completeness. According to the specification in [1], there is a time after which D_p satisfies strong completeness, i.e., q ∈ D_p, hence


where the property of eventual weak accuracy is also mandatory, but the time required for its adoption by the failure detector is not fixed. To handle this eventuality, we introduce an action for the failure detector, called weakAccuracy (discussed in Section 3.3.1), to determine whether the failure detector is weakly accurate or not. As soon as it satisfies this property, every non-crashed process is supposed to either reach a decision or crash. So, for Algorithm 2, we express this requirement in the µ-calculus as:

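$$\forall_{p \in \pi}\; [\,(\overline{crashed(p)} \wedge (\forall_{v \in V}\, \overline{decide(p, v)}))^{*} \cdot weakAccuracy\,]\; \mu X \cdot \big([\,\overline{crashed(p)} \wedge (\forall_{v \in V}\, \overline{decide(p, v)})\,]X \wedge \langle true \rangle true\big)$$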

R4. In Phase 1 of both Algorithms 1 and 2, every correct process proposes a value and in the last phase, it decides a value. According to this requirement, the decided value can only be a proposed value by some participant. The formalization of this requirement in the µ-calculus is:

$$\forall_{p \in \pi}\, \forall_{v \in V}\; [\,(\forall_{p' \in \pi}\, \overline{send(p', v)})^{*} \cdot decide(p, v)\,]\,false$$

4.2 Verification results
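The safety requirements R1, R2 and R4 of Section 4.1 can also be read as simple checks on a single finite trace. The following Python sketch illustrates that reading; it is not a substitute for model checking, which covers all traces. The representation of a trace as a list of `(action, process, value)` tuples and the function name `check_trace` are assumptions made for this example.

```python
def check_trace(trace):
    """Check R1, R2 and R4 on one finite trace.

    trace -- list of ("send", p, v) and ("decide", p, v) tuples.
    Returns a dict mapping "R1", "R2", "R4" to True (holds) or False.
    """
    decided = {}      # process -> first decided value
    proposed = set()  # values that have been sent so far
    result = {"R1": True, "R2": True, "R4": True}
    for action, p, v in trace:
        if action == "send":
            proposed.add(v)
        elif action == "decide":
            if p in decided:
                result["R2"] = False  # p decides a second time
            if any(w != v for w in decided.values()):
                result["R1"] = False  # two different decided values
            if v not in proposed:
                result["R4"] = False  # decided value was never proposed
            decided.setdefault(p, v)
    return result
```

For instance, the trace `[("send", 0, 7), ("decide", 0, 7), ("decide", 1, 5)]` violates R1 and R4 but not R2.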

To verify whether the above-mentioned requirements are satisfied or violated, we used the Evaluator model checker (version 1.5) of the CADP toolset [2, 5] and found that both protocols meet all of these requirements. Model checking was done for three processes on a Pentium Dual Core (1.8 GHz) machine with 2 GB of RAM. The amount of time spent on the verification of each property is reported in Table 1. We use the strong bisimulation reduction technique to reduce the size of the state space; hence the time mentioned in Table 1 also includes this reduction time. The following commands, in the given sequence, produce the results, where INFILE contains the formal specification and the FORMULA file contains a µ-calculus formula.

1. mcrl22lps -v -D INFILE.mcrl2 OUTFILE.lps

To translate an mCRL2 process specification from INFILE.mcrl2 to a linear process specification (LPS), to be stored in the file named OUTFILE.lps. The option -v (verbose) displays short intermediate messages, while the option -D (delta) is necessary to enforce the untimed semantics of mCRL2 (i.e., to allow for arbitrary time steps in all reachable states).

2. lpsconstelm -v OUTFILE.lps temp.lps

To reduce the linear process specification by removing spurious constant process parameters from OUTFILE.lps and write the result to temp.lps.

3. lpssumelm -v temp.lps OUTFILE.lps

To remove superfluous summations from temp.lps and write the result to OUTFILE.lps.

4. lpsparelm -v OUTFILE.lps temp.lps

To remove unused parameters from OUTFILE.lps and write the result to temp.lps.

5. lps2lts -v -ftree temp.lps OUTFILE.svc

To generate a labelled transition system (LTS) from temp.lps and write the result to OUTFILE.svc. The option -ftree is used to store states internally in tree format for efficient usage of memory.

6. ltsconvert -ebisim -v OUTFILE.svc OUTFILE.aut

To convert the labelled transition system (LTS) in OUTFILE.svc to OUTFILE.aut after applying minimisation modulo strong bisimilarity.

7. bcg_io OUTFILE.aut OUTFILE.bcg

To convert the graph in OUTFILE.aut into the Binary Coded Graphs (BCG) format, which is the input format for the CADP toolset.

8. bcg_open OUTFILE.bcg evaluator -verbose -bfs -diag FORMULA.mcl

To determine whether the formula given in FORMULA.mcl is satisfied or not. In case it is refuted, a trace showing the counterexample is displayed due to the option -diag; the option -bfs is used for breadth-first search.
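The eight steps above can be collected into a small Python helper that only builds the command lines, so that they can be inspected or passed to `subprocess.run` one by one. The tool names and options are copied verbatim from the list above and reflect the tool versions used in this report; other versions of the toolsets may expect different options. The function names `cadp_pipeline` and `render` are hypothetical.

```python
import shlex


def cadp_pipeline(infile="INFILE", outfile="OUTFILE", formula="FORMULA"):
    """Return the eight commands as argument vectors (nothing is executed)."""
    return [
        ["mcrl22lps", "-v", "-D", f"{infile}.mcrl2", f"{outfile}.lps"],
        ["lpsconstelm", "-v", f"{outfile}.lps", "temp.lps"],
        ["lpssumelm", "-v", "temp.lps", f"{outfile}.lps"],
        ["lpsparelm", "-v", f"{outfile}.lps", "temp.lps"],
        ["lps2lts", "-v", "-ftree", "temp.lps", f"{outfile}.svc"],
        ["ltsconvert", "-ebisim", "-v", f"{outfile}.svc", f"{outfile}.aut"],
        ["bcg_io", f"{outfile}.aut", f"{outfile}.bcg"],
        ["bcg_open", f"{outfile}.bcg", "evaluator",
         "-verbose", "-bfs", "-diag", f"{formula}.mcl"],
    ]


def render(commands):
    """Join each argument vector into one shell-quoted line."""
    return "\n".join(shlex.join(cmd) for cmd in commands)


if __name__ == "__main__":
    print(render(cadp_pipeline()))
```

Keeping the commands as data makes the alternating use of OUTFILE.lps and temp.lps between steps 1 and 4 easy to follow.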

                               Algorithm 1    Algorithm 2
Time to generate state space   9h54m0s        1h37m0s
Number of states               1507990        45329
R1                             12m13.470s     0m22.013s
R2                             12m4.160s      0m22.135s
R3                             7m17.847s      0m9.490s
R4                             0m5.573s       0m0.315s

Table 1: Time required for the verification using the CADP toolset

We also applied another tool for model checking, called PBES2Bool (version June 2009), which is part of the mCRL2 toolset; the time it requires for the verification is given in Table 2. The advantage of this tool, compared to the Evaluator tool, is that it does not require generation of the state space, and the time required for the verification of each individual requirement (Table 2) is less than the time needed to both generate the state space and verify the same requirement in CADP. However, the total time for the verification of all the requirements is a little longer: 1h38m24.304s for PBES2Bool vs. 1h37m53.953s for generating the state space plus model checking in CADP. We could verify the requirements only for Algorithm 2 with n = 3 because of its smaller number of transitions. To get the results, we use the following commands in the given order after generating the linear process specification in the file temp.lps (after step 4 given above) and specifying the µ-calculus formulae in the file FORMULA.mcf.

1. lps2pbes -f FORMULA.mcf temp.lps OUTFILE.pbes

To convert the state formula in FORMULA.mcf and the LPS in temp.lps to a parameterized boolean equation system (PBES) and save it to OUTFILE.pbes.

2. pbesparelm -v temp.pbes OUTFILE.pbes

To apply parameter elimination on temp.pbes and write the result to OUTFILE.pbes.

3. pbes2bool -vprjittyc OUTFILE.pbes -s1

To solve the parameterized boolean equation system (PBES) in OUTFILE.pbes. The option -vprjittyc is a combination of multiple abbreviations: v to display short intermediate messages, p to precompile the PBES for faster rewriting, and r to use the rewrite strategy called jittyc [10].

      Algorithm 2
R1    40m58.730s
R2    42m15.587s
R3    14m59.494s
R4    0m10.493s

Table 2: Time required for verification using the mCRL2 toolset
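The two totals quoted in the comparison of the toolsets follow directly from the Algorithm 2 columns of Tables 1 and 2. The short Python sketch below redoes that arithmetic with exact decimals; the helper names `to_seconds` and `fmt` are hypothetical, and the durations are copied from the tables.

```python
import re
from decimal import Decimal


def to_seconds(text):
    """Parse durations such as '1h37m0s' or '0m22.013s' into seconds."""
    hours, minutes, seconds = re.fullmatch(
        r"(?:(\d+)h)?(\d+)m([\d.]+)s", text).groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + Decimal(seconds)


def fmt(total):
    """Format a number of seconds back into the h/m/s notation."""
    hours = int(total // 3600)
    rest = total - hours * 3600
    minutes = int(rest // 60)
    return f"{hours}h{minutes}m{rest - minutes * 60}s"


# Algorithm 2, Table 1: state space generation plus R1..R4 in CADP.
CADP = ["1h37m0s", "0m22.013s", "0m22.135s", "0m9.490s", "0m0.315s"]
# Algorithm 2, Table 2: R1..R4 with PBES2Bool.
PBES = ["40m58.730s", "42m15.587s", "14m59.494s", "0m10.493s"]

total_cadp = sum(to_seconds(t) for t in CADP)
total_pbes = sum(to_seconds(t) for t in PBES)

if __name__ == "__main__":
    print(fmt(total_cadp), fmt(total_pbes), total_pbes - total_cadp)
```

The difference between the two totals is about 30 seconds in favour of CADP.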

5 Conclusions

In fault-tolerant distributed systems, the consensus problem plays a fundamental role [9]. In the consensus problem, every process proposes a value and, if it remains non-crashed during execution, it eventually decides a value with the property that the decision is irrevocable and unanimous [8]. Consensus cannot be solved in asynchronous distributed systems with crash failures [4]. Hence, to implement consensus, participating processes rely on the notion of a failure detector. A failure detector is called perfect if it never suspects a correct process but eventually suspects every crashed process. In asynchronous systems, it is impossible to devise a perfect failure detector because it cannot differentiate between a crashed process and a slow process. In [1], unreliable failure detectors are introduced to solve the consensus problem in an asynchronous system with crash failures, provided that they satisfy the properties of completeness and accuracy.

In this paper, we formalized two distributed algorithms for the consensus problem together with their requirements. Our verification shows that all of the requirements are satisfied by both algorithms. We presented our approach for the specification of the protocols in the mCRL2 syntax and of the requirements in the modal µ-calculus. We devised a common failure detector that satisfies weak accuracy and strong completeness (or eventual strong completeness). We model-checked the behaviour of the protocols with three participating processes.

Acknowledgements

The author would like to thank MohammadReza Mousavi, Jan Friso Groote and Muhammad Rizwan Asghar for reviews and valuable comments.

References

[1] Tushar Deepak Chandra and Sam Toueg. Unreliable failure detectors for reliable distributed systems. J. ACM, 43(2):225-267, 1996.

[2] Jean-Claude Fernandez, Hubert Garavel, Alain Kerbrat, Laurent Mounier, Radu Mateescu, and Mihaela Sighireanu. CADP - a protocol validation and verification toolbox. In CAV, pages 437-440, 1996.

[3] Michael J. Fischer. The consensus problem in unreliable distributed systems (a brief survey). In Marek Karpinski, editor, FCT, volume 158 of Lecture Notes in Computer Science, pages 127-140. Springer, 1983.

[4] Michael J. Fischer, Nancy A. Lynch, and Mike Paterson. Impossibility of distributed consensus with one faulty process. J. ACM, 32(2):374-382, 1985.

[5] Hubert Garavel, Frédéric Lang, Radu Mateescu, and Wendelin Serwe. CADP 2006: A Toolbox for the Construction and Analysis of Distributed Processes. In Werner Damm and Holger Hermanns, editors, Computer Aided Verification (CAV'2007), volume 4590 of Lecture Notes in Computer Science, pages 158-163, Berlin, Germany, 2007.

[6] Jan Friso Groote, Aad Mathijssen, Muck van Weerdenburg, and Yaroslav S. Usenko. From µCRL to mCRL2: motivation and outline. Electr. Notes Theor. Comput. Sci., 162:191-196, 2006.

[7] Dexter Kozen. Results on the propositional mu-calculus. Theor. Comput. Sci., 27:333-354, 1983.

[8] Ajay D. Kshemkalyani and Mukesh Singhal. Distributed Computing. Cambridge University Press, The Edinburgh Building, Cambridge CB2 8RU, UK, 2008.

[9] Gil Neiger and Sam Toueg. Automatically increasing the fault-tolerance of distributed algorithms. J. Algorithms, 11(3):374-419, 1990.

[10] Muck van Weerdenburg. An account of implementing applicative term rewriting. Electron. Notes Theor. Comput. Sci., 174(10):139-155, 2007.


A mCRL2 specification for consensus problem with strong completeness and weak accuracy

This is the mCRL2 specification of the consensus problem discussed in Section 2.2.

map
  N : ℕ;
  minus : List(ℕ) × List(ℕ) → List(ℕ);
  eliminate : List(ℕ) × ℕ → List(ℕ);
  update_V : List(ℕ) × List(List(ℕ)) → List(ℕ);
  removeBottom : List(ℕ) × List(ℕ) → List(ℕ);
  update_V2phase : List(List(ℕ)) × List(ℕ) × ℕ → List(ℕ);
  updateDelta : List(ℕ) × List(ℕ) × List(ℕ) → List(ℕ);
  findDecided : List(ℕ) → ℕ;
  π : List(ℕ);
  updateMsgs : ℕ × List(List(ℕ)) × List(ℕ) → List(List(ℕ));
  updateCrashed : List(ℕ) × ℕ → List(ℕ);
  addcrashed : List(ℕ) × List(ℕ) → List(ℕ);
  makeIdentical : List(ℕ) × List(ℕ) → List(ℕ);
  updateLastmsgs : List(List(ℕ)) × List(ℕ) → List(ℕ);
  Correct : ℕ;

var
  ln, lg, ld : List(ℕ);
  msgs : List(List(ℕ));
  lb : List(𝔹);
  x, m, n, k : ℕ;
  s, b, p : 𝔹;

eqn
  updateLastmsgs(lg ▷ msgs, ln) =
    if(#msgs > 0, updateLastmsgs(msgs, makeIdentical(lg, ln)), makeIdentical(lg, ln));
  updateLastmsgs([], ln) = ln;
  makeIdentical(ln, []) = [];
  makeIdentical(x ▷ lg, n ▷ ln) =   % 0 is used for ⊥
    if(x ≈ 0, 0 ▷ makeIdentical(lg, ln), n ▷ makeIdentical(lg, ln));
  N = 3;          % Total number of processes
  π = [0, 1, 2];  % IDs of the processes
  Correct = 2;    % ID of the correct process
  minus([], lg) = [];
  minus(ln, []) = ln;
  minus(n ▷ ln, m ▷ lg) = if(m ∈ n ▷ ln, minus(eliminate(n ▷ ln, m), lg), minus(n ▷ ln, lg));
  eliminate(n ▷ ln, m) = if(n ≈ m, ln, n ▷ eliminate(ln, m));
  updateDelta([], lg, ln) = [];
  updateDelta(n ▷ lg, m ▷ ln, x ▷ ld) =
    if(m ≉ n, m ▷ updateDelta(lg, ln, ld), x ▷ updateDelta(lg, ln, ld));
  update_V(ln, lg ▷ msgs) =
    if(#msgs > 0, update_V(removeBottom(ln, lg), msgs), removeBottom(ln, lg));
  removeBottom(n ▷ ln, k ▷ lg) =
    if(n ≈ 0 ∧ k ≉ 0, k ▷ removeBottom(ln, lg), n ▷ removeBottom(ln, lg));
  removeBottom([], []) = [];
  removeBottom([], lg) = [];
  removeBottom(ln, []) = [];
  update_V2phase(msgs, [], k) = [];
  update_V2phase(ln ▷ msgs, n ▷ lg, k) =
    if(ln.k ≈ 0, 0 ▷ update_V2phase(msgs, lg, k + 1),
       n ▷ update_V2phase(msgs, lg, k + 1));


  findDecided([]) = 0;
  findDecided(n ▷ ln) = if(n ≉ 0, n, findDecided(ln));
  updateMsgs(0, lg ▷ msgs, ln) = ln ▷ msgs;
  updateMsgs(0, [], ln) = [ln];
  (n > 0) → updateMsgs(n, lg ▷ msgs, ln) = lg ▷ updateMsgs(Int2Nat(n − 1), msgs, ln);
  updateCrashed([], n) = [];
  updateCrashed(ln, n) = if(n ∈ ln, ln, n ▷ ln);
  addcrashed([], []) = [];
  addcrashed(ln, []) = ln;
  addcrashed(ln, n ▷ lg) = if(n ∈ ln, addcrashed(ln, lg), n ▷ addcrashed(ln, lg));

act
  send2all, rcv, broadcast : ℕ × List(ℕ) × ℕ;
  sendTo, receive, received : ℕ × List(ℕ) × ℕ × ℕ;
  decide : ℕ × ℕ;
  rcv_crashing, rcv_query : ℕ;
  send_list, queryFD, getCrashedList : List(ℕ) × ℕ;
  suspected : ℕ × ℕ × 𝔹;
  crashed,
  send_stopWaiting, rcv_stopWaiting, stopWaiting, strongComplete : ℕ;
  suspect : ℕ × ℕ;

proc

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Process for failure detector
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  FD(crashed : List(ℕ)) = ∑_{id:ℕ} rcv_addRequest(id).FD(update_crashed(crashed, id))
    +
    (send_list(crashed, 0)
     + send_list(crashed, 1)
     + send_list(crashed, 2)).FD(crashed);

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Process for Channel
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  Channel(myId, round : ℕ) =
    ∑_{∆:List(ℕ)}.rcv(round, ∆, myId).
    randomBroadcast(round, ∆, myId, 0, π);

  randomBroadcast(round : ℕ, ∆ : List(ℕ), myId, i : ℕ, to : List(ℕ)) =
    (i < N) → (
      (0 ∈ to) → sendTo(round, ∆, myId, 0).
        randomBroadcast(round, ∆, myId, i + 1, minus(to, [0]))
      +
      +
    )
    ⋄
    Channel(myId, round);

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Process for Phase 1
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  % each process sends its message to all and receives from all,
  % then it processes the messages of only the not-suspected processes.

  Phase1(myId, round : ℕ, V, ∆ : List(ℕ), msgs : List(List(ℕ)), msg_sent : 𝔹) =
    +
    (round ≤ N − 1) → ((¬msg_sent) → send2all(round, ∆, myId).
        Phase1(myId, round, V, ∆, msgs, true)
      ⋄
        ∑_{lst:List(ℕ)}.queryFD(lst, myId).
        WaitandReceive(myId, round, V, ∆, msgs, minus(π, lst))
      )
    ⋄
    Phase2(myId, V, [minus([0], [0]), minus([0], [0]), minus([0], [0])], false);

127

128 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 129 % Process f o r Wait and r e c e i v e

130 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 131

132 W aitandReceive(myId, round : N, V, ∆ : List(N), msgs : List(List(N)), f rom : List(N)) =

133 (#f rom > 0) → (

134 (0 ∈ f rom) →P_∆_q_:List(N).receive(round, ∆q, 0, myId). 135 (suspected(myId, 0, f alse).

136 W aitandReceive(myId, round, V, [⊥, ⊥, ⊥], updateM sgs

137 (0, msgs, ∆q), minus(f rom, [0]))

138 +

139 suspected(myId, 0, true).

140 W aitandReceive(myId, round, V, [0, 0, 0], msgs, minus(f rom, [0]))

141 )

142 +

143 (1 ∈ f rom) →P_∆

q:List(N).receive(round, ∆q, 1, myId).

144 (suspected(myId, 1, f alse).

145 W aitandReceive(myId, round, V, [⊥, ⊥, ⊥], updateM sgs

146 (1, msgs, ∆q), minus(f rom, [1]))

147 +

148 suspected(myId, 1, true).

149 W aitandReceive(myId, round, V, [⊥, ⊥, ⊥], msgs, minus(f rom, [1]))

150 )

151 +

152 (2 ∈ f rom) →P_∆_q_:List(N).receive(round, ∆q, 2, myId).suspected(myId, 2, f alse). 153 W aitandReceive(myId, round, V, [⊥, ⊥, ⊥],

154 updateM sgs(2, msgs, ∆q), minus(f rom, [2]))

155 +

156 (0 ∈ f rom) → rcv_stopW aiting(0).W aitandReceive(myId, round, V,

157 [⊥, ⊥, ⊥], msgs, minus(f rom, [0]))

158 +

159 (1 ∈ f rom) → rcv_stopW aiting(1).W aitandReceive(myId, round, V,

160 [⊥, ⊥, ⊥], msgs, minus(f rom, [1]))

161 )

162

163 P hase1(myId, round + 1, update_V (V, msgs),

164 updateDelta(V, update_V (V, msgs), [⊥, ⊥, ⊥]), msgs, false);

165

166 % a f t e r c r a s h i n g

167 CrashedP roc(myId : N, mt2, mt3, stronglyComplete : B) =

168 (¬stronglyComplete) → send_addRequest(myId).CrashedP roc(myId, mt2, mt3, true)

169 +

170 P_q,round:N.P∆q:List(N).

171 receive(round, ∆q, q, myId).CrashedP roc(myId, mt2, mt3, stronglyComplete) 172 +

173 % A p r o c e s s p i s crashed b e f o r e sending a message to q , and 174 % q i s w aiting because q q u e r i e d FD when p was a l i v e , so q w i l l 175 % continue to wait u n t i l p i s added to the l i s t crashed i n FD. 176 % The paramters mt2 and mt3 are to ensure the o c c u r r e n c e o f the

177 % send_stopWaiting a c t i o n only once . 178

(30)

179 (¬mt2 ∧ stronglyComplete) → send_stopW aiting(myId).

180 CrashedP roc(myId, true, mt3, stronglyComplete);

181 +

182 (¬mt3 ∧ stronglyComplete) → send_stopW aiting(myId).

183 CrashedP roc(myId, mt2, true, stronglyComplete)

184


  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Process for Phase 2
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % message sent in round 0 means phase-2 as there is no
  % round in phase 2 but in phase 1 rounds are 1 to n−1

  Phase2(myId : ℕ, V : List(ℕ), lastmsgs : List(List(ℕ)), V_sent : 𝔹) =
    (myId ≉ Correct) → crashed(myId).CrashedProc(myId, false, false, false)
    +
    (¬V_sent) → send2all(0, V, myId).Phase2(myId, V, lastmsgs, true)
    ⋄
    ∑_{lst:List(ℕ)}.queryFD(lst, myId).
    WaitandReceive2(myId, V, lastmsgs, minus(π, lst));

  WaitandReceive2(myId : ℕ, V : List(ℕ), lastmsgs : List(List(ℕ)), from : List(ℕ)) =
    (#from > 0) → ∑_{q:ℕ} ∑_{V_q:List(ℕ)}.receive(0, V_q, q, myId).
      WaitandReceive2(myId, V, updateMsgs(q, lastmsgs, V_q),
                      minus(from, [q]))
    ⋄
    Phase3(myId, updateLastmsgs(lastmsgs, V));

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Process for Phase 3
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  Phase3(myId : ℕ, V : List(ℕ)) = decide(myId, findDecided(V));


  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Process for Consensus
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  Consensus = τ{stopWaiting},
    (∇{decide, received, broadcast, getCrashedList,
       crashed, stopWaiting, suspected, strongComplete},
     Γ({sendTo|receive → received,
        send_list|queryFD → getCrashedList,
        send2all|rcv → broadcast,
        send_addRequest|rcv_addRequest → strongComplete,
        send_stopWaiting|rcv_stopWaiting → stopWaiting},
       Phase1(0, 1, [7, 0, 0], [7, 0, 0], [[0, 0, 0], [0, 0, 0], [0, 0, 0]], false) ∥
       Phase1(1, 1, [0, 5, 0], [0, 5, 0], [[0, 0, 0], [0, 0, 0], [0, 0, 0]], false) ∥
       Phase1(2, 1, [0, 0, 9], [0, 0, 9], [[0, 0, 0], [0, 0, 0], [0, 0, 0]], false) ∥
       Channel(0, 0) ∥ Channel(0, 1) ∥
       Channel(1, 0) ∥ Channel(1, 1) ∥
       Channel(2, 0) ∥ Channel(2, 1) ∥
       FD([])
    ));

init
  Consensus;
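To make the recursive list equations of this appendix easier to read, the following Python sketch transcribes three of them: eliminate, minus and findDecided. It is an illustration only and not part of the mCRL2 specification. The empty-list case of `eliminate` is an assumption added so that the function is total; the specification gives no equation for it and only calls eliminate when the element is present.

```python
def eliminate(xs, m):
    """eliminate(n |> ln, m) = if(n == m, ln, n |> eliminate(ln, m)).
    Removes the first occurrence of m from xs."""
    if not xs:
        return []  # assumption: case not covered by the specification
    head, tail = xs[0], xs[1:]
    return tail if head == m else [head] + eliminate(tail, m)


def minus(xs, ys):
    """minus(ln, lg): remove from xs one occurrence of each element of ys."""
    if not xs:
        return []
    if not ys:
        return xs
    m, rest = ys[0], ys[1:]
    if m in xs:
        return minus(eliminate(xs, m), rest)
    return minus(xs, rest)


def find_decided(vs):
    """findDecided: first entry different from 0 (0 encodes bottom), else 0."""
    for v in vs:
        if v != 0:
            return v
    return 0
```

For example, `minus([0, 1, 2], [1])` yields `[0, 2]`, which is how the specification computes the set of processes from which a message is still expected, and `find_decided([0, 5, 0])` yields `5`.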


B mCRL2 specification for consensus problem with strong completeness and eventual weak accuracy

This is the mCRL2 specification of the consensus problem discussed in Section 2.3.

sort
  Ack_Type = struct ack | nack;

map
  N : Pos;
  Correct : ℕ;
  π : List(ℕ);
  minus : List(ℕ) × List(ℕ) → List(ℕ);
  eliminate : List(ℕ) × ℕ → List(ℕ);
  isGreater : ℕ × ℕ → ℕ;
  updateEstimate : ℕ × ℕ × ℕ × ℕ → ℕ;
  addcrashed : List(ℕ) × List(ℕ) → List(ℕ);
  Addcrashed : List(ℕ) × List(ℕ) → List(ℕ);
  updateCrashed : List(ℕ) × ℕ → List(ℕ);

var
  ln, lg, ld : List(ℕ);
  msgs : List(List(ℕ));
  lb : List(𝔹);
  x, m, n, k : ℕ;
  s, b : 𝔹;

eqn
  N = 3;
  Correct = 2;
  π = [0, 1, 2];
  minus([], lg) = [];
  minus(ln, []) = ln;
  minus(n ▷ ln, m ▷ lg) = if(m ∈ n ▷ ln, minus(eliminate(n ▷ ln, m), lg), minus(n ▷ ln, lg));
  eliminate(n ▷ ln, m) = if(n ≈ m, ln, n ▷ eliminate(ln, m));
  isGreater(n, m) = if(m > n, m, n);
  updateEstimate(x, k, n, m) = if(m > n, k, x);
  Addcrashed(n ▷ ln, lg) =
    if(n ≈ Correct, addcrashed(ln, lg), addcrashed(n ▷ ln, lg));
  Addcrashed([], lg) = lg;
  addcrashed([], []) = [];
  addcrashed(ln, []) = ln;
  addcrashed(ln, n ▷ lg) = if(n ∈ ln, addcrashed(ln, lg), n ▷ addcrashed(ln, lg));
  updateCrashed([], n) = [];
  updateCrashed(ln, n) = if(n ∈ ln, ln, n ▷ ln);

act
  send, rcv, broadcast : ℕ × ℕ × ℕ × ℕ × ℕ;
  sendTo, rcvfrom, received : ℕ × ℕ × ℕ × ℕ × ℕ × ℕ;
  weakAccuracy, replyQuery, rcv_list, queryFD : List(ℕ) × ℕ × ℕ;
  sendDecision, rcvDecision, DecisionBC : ℕ × ℕ × 𝔹 × List(ℕ);
  rcvDecisioFrom, sendDecisionTo, DecisionRcvd : ℕ × 𝔹 × ℕ;
  decide : ℕ × ℕ;
  send3, rcv3, SendAckNack : ℕ × ℕ × Ack_Type × ℕ;
  rcv_crashed, send_crashed, crashed, waiting4decision : ℕ;
  send_CFailure, rcv_CFailure, CFailure : ℕ × ℕ;
  send_addRequest, rcv_addRequest, strongComplte : ℕ;

proc

  FD(crashed : List(ℕ), totalCrashed : ℕ, weaklyAccurate : 𝔹) =
    % only one process out of three is allowed to crash
    (totalCrashed ≈ 0) → ∑_{id:ℕ}.rcv_crashed(id).
      FD(crashed, totalCrashed + 1, weaklyAccurate)
    +
    ∑_{id:𝔹}.rcv_addRequest(id).FD(updateCrashed(crashed, id), totalCrashed, weaklyAccurate)
    +
    (¬weaklyAccurate) → weakAccuracy.FD(crashed, totalCrashed, true)
    +
    ((weaklyAccurate) → (∑_{round:ℕ}.replyQuery(Addcrashed(((round mod N) + 1) ▷ [], crashed), 0, round)
        +
        ∑_{round:ℕ}.replyQuery(crashed, 0, round)
        +
        ∑_{round:ℕ}.replyQuery(Addcrashed(((round mod N) + 1) ▷ [], crashed), 1, round)
        +
        +
        ∑_{round:ℕ}.replyQuery(Addcrashed(((round mod N) + 1) ▷ [], crashed), 2, round)
        +
      )
      ⋄
      (
        ∑_{round:ℕ}.replyQuery(Addcrashed([Correct], crashed), 0, round)
        +
        +
        +
        +
        +
      )
    ).FD(crashed, totalCrashed, weaklyAccurate);


  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Process for Channels
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  Channel(myId, round : ℕ) =
    ∑_{estimate,ts,phase:ℕ}.rcv(phase, myId, round, estimate, ts).
    randomBroadcast(phase, myId, round, estimate, ts, π);

  randomBroadcast(phase, myId, round, estimate, ts : ℕ, To : List(ℕ)) =
    (phase ≈ 2) →
      ((#To > 0) → (
        (0 ∈ To) → sendTo(phase, myId, round, estimate, ts, 0).
          randomBroadcast(phase, myId, round, estimate, ts, minus(To, [0]))
        +
        +
      )
      ( sendTo(phase, myId, round, estimate, ts, (round mod N) + 1)
      ).Channel(myId, round);

  Channel4AckNack(myId, round : ℕ) =
    ∑_{to:ℕ,msg_type:Ack_Type}.rcv3(myId, round, msg_type, to).
    (sendAckNack(myId, round, msg_type, to).Channel4AckNack(myId, round);

  Channel4Decision(myId : ℕ) =
    ∑_{estimate:ℕ}.∑_{flag:𝔹}.∑_{To:List(ℕ)}.rcvDecision(myId, estimate, flag, To).
    randomBroadcastDecision(myId, estimate, flag, To);

  randomBroadcastDecision(myId, estimate : ℕ, flag : 𝔹, To : List(ℕ)) =
    (#To > 0) →
      ((0 ∈ To) → sendDecisionTo(estimate, flag, 0).
         randomBroadcastDecision(myId, estimate, flag, minus(To, [0]))
       +
       (1 ∈ To) → sendDecisionTo(estimate, flag, 1).
       +
       (2 ∈ To) → sendDecisionTo(estimate, flag, 2).
      )
    ⋄
    Channel4Decision(myId);

  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  % Process for Phase 1
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  Phase1(myId, round, estimate, ts : ℕ) =
    (round ≤ N) → send(1, myId, round, estimate, ts).Phase2(myId, round, estimate, ts, π, 0)
    ⋄
    send(1, myId, 0, estimate, ts).Phase2(myId, 0, estimate, ts, π, 0)
    +
    (myId ≉ Correct) → send_crashed(myId).Crashed(myId, round, minus(π, [myId]), false);

  Phase2(myId, round, estimate, ts : ℕ, from : List(ℕ), i : ℕ) =
    (myId ≉ Correct) → send_crashed(myId).Crashed(myId, round, minus(π, [myId]), false)
    +
    ((round mod N) + 1 ≈ myId ∧ #from > 0) →
      ((i < (N + 1) div 2) →
        ∑_{q,estimate_q,ts_q:ℕ}.
        rcvfrom(1, q, round, estimate_q, ts_q, myId).
        Phase2(myId, round, updateEstimate(estimate,
               estimate_q, ts, ts_q), isGreater(ts, ts_q),
               minus(from, [q]), i + 1
        )
      ⋄
        send(2, myId, round, estimate, ts).
        Phase3(myId, round, estimate, ts)
      )
    ⋄
    Phase3(myId, round, estimate, ts);

  % locked value is received from coordinator and
  % ack or nack is sent back.
  Phase3(myId, round, estimate, ts : ℕ) =
    (myId ≉ Correct) → send_crashed(myId).Crashed(myId, round, minus(π, [myId]))
    +
    +
    ∑_{est_q,ts_q:ℕ}.rcvfrom(2, (round mod N) + 1, round, est_q, ts_q, myId).
    ∑_{lst:List(ℕ)}.rcv_list(lst, myId, round).
    ((round mod N) + 1 ∈ lst) → send3(myId, round, nack, (round mod N) + 1).
      Phase4(myId, round, estimate, ts, 0, π)
    ⋄
    send3(myId, round, ack, (round mod N) + 1).
      Phase4(myId, round, est_q, ts_q, 0, π);


  Phase4(myId, round, estimate, ts, i : ℕ, from : List(ℕ)) =
    (myId ≉ Correct) →
      send_crashed(myId).Crashed(myId, round, minus(π, [myId]), false)
    +
    ((round mod N) + 1 ≈ myId) →
      ((i < (N + 1) div 2) →
        (∑_{q:ℕ}.∑_{msg_type:Ack_Type}.rcvAckNack(q, round, msg_type, myId).
          (msg_type ≈ ack) →
            Phase4(myId, round, estimate,
                   ts, i + 1, minus(from, [q]))
          ⋄
            StartNextRound(myId, round, estimate,
                           ts, minus(from, [q]))
        )
      ⋄
        sendDecision(myId, estimate, true, minus(π, [myId])).
        decide(myId, estimate).δ
      )
    ⋄
    Wait4decision(myId, round, estimate, ts, false, false);

  StartNextRound(myId, round, estimate, ts : ℕ, from : List(ℕ)) =
    (#from > 0) → ∑_{msg_type:Ack_Type}.
      ((0 ∈ from) → (rcvAckNack(0, round, msg_type, myId)
                     +
                     rcv_discardWaiting(0, myId)
                    ).
                    StartNextRound(myId, round, estimate, ts, minus(from, [0]))
       +
       (1 ∈ from) → (rcvAckNack(1, round, msg_type, myId)
                     + rcv_discardWaiting(1, myId)
                    ).StartNextRound(myId, round, estimate
                                     , ts, minus(from, [1]))
       + (2 ∈ from) → (rcvAckNack(2, round, msg_type, myId)
                     + rcv_discardWaiting(2, myId)
                    ).StartNextRound(myId, round, estimate
                                     , ts, minus(from, [2]))
      )
    ⋄
    sendDecision(myId, estimate, false, minus(π, [myId])).
    Phase1(myId, round + 1, estimate, ts);

  Wait4decision(myId, round, estimate, ts : ℕ, decided, finish : 𝔹) =
    waiting4decision(myId).(
      rcv_CFailure(myId, round).Phase1(myId, round + 1, estimate, ts)
      +
      ∑_{v:ℕ}.∑_{done:𝔹}.rcvDecisionFrom(v, done, myId)
        .(done) → decide(myId, v).δ
      ⋄
      Phase1(myId, round + 1, estimate, ts));

(35)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Crashed Process
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Crashed(myId, round : ℕ, ls1 : List(ℕ), stronglyComplete : 𝔹) =
    (¬stronglyComplete) → send_addRequest(myId).Crashed(myId, round, ls2, true)
    ⋄
    ((round mod N) + 1 ≈ myId ∧ #ls1 > 0) → (
        (stronglyComplete) → (send_CFailure(0, round).Crashed(myId, round,
                                  minus(ls1, [0]), stronglyComplete)
            + send_CFailure(1, round).Crashed(myId, round, minus(ls1, [1]), stronglyComplete)
            + send_CFailure(2, round).Crashed(myId, round, minus(ls1, [2]), stronglyComplete)
            + ∑ q:ℕ. ∑ msg_type:Ack_Type.
                rcvAckNack(q, round, msg_type, myId).Crashed(myId, round, ls1, stronglyComplete)
        ))
    ⋄ (
        (stronglyComplete) → send_discardWaiting(myId, 0)
            + send_discardWaiting(myId, 1)
            + send_discardWaiting(myId, 2)
            + ∑ q,estimateq,tsq:ℕ. rcvfrom(1, q, round, estimateq, tsq, myId)
            + ∑ q,estimateq,tsq:ℕ. rcvfrom(2, q, round, estimateq, tsq, myId)
            + ∑ v:ℕ. ∑ done:𝔹. rcvDecisionFrom(v, done, myId).
                (done) → decide(myId, v)
                       ⋄ Crashed(myId, round, ls1, stronglyComplete)
            + ∑ q:ℕ. ∑ msg_type:Ack_Type.
                rcvAckNack(q, round, msg_type, myId).Crashed(myId, round, ls1, stronglyComplete)
    ).Crashed(myId, round, ls1, stronglyComplete);

Consensus = Υ discardWaiting,
    (∇{broadcast, received, queryFD, decide, DecisionBC,
       DecisionRcvd, SendAckNack, AckNack_rcvd, strongComplte, weakAccuracy,
       crashed, discardWaiting, waiting4decision, CFailure},
     Γ({send|rcv → broadcast,
        sendTo|rcvfrom → received,
        replyQuery|rcv_list → queryFD,
        send3|rcv3 → SendAckNack,
        …
        rcv_discardWaiting|send_discardWaiting → discardWaiting,
        send_addRequest|rcv_addRequest → strongComplte},
       Phase1(0, 0, 5, 1) ∥ Phase1(1, 0, 7, 1) ∥ Phase1(2, 0, 2, 1) ∥
       Channel(0, 0) ∥ Channel(0, 1) ∥ Channel(0, 2) ∥
       …
       FD([], 0) ∥
       Channel4AckNack(0, 0) ∥ Channel4AckNack(0, 1) ∥ Channel4AckNack(0, 2) ∥
       …
       Channel4Decision(0) ∥ Channel4Decision(1) ∥ Channel4Decision(2)
    ));

init
    Consensus;
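The coordinator test and the acknowledgement count used in Phase4 can also be illustrated outside mCRL2. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, the constant N is passed in as a parameter n because its value is not given in this part of the listing, and the replies are modelled as a plain list of "ack"/"nack" strings rather than as channel messages. It mirrors the expressions (round mod N) + 1 and (N + 1) div 2 from the specification.

```python
from typing import List


def coordinator(round_no: int, n: int) -> int:
    """Identifier of the coordinator of a round: (round mod N) + 1."""
    return (round_no % n) + 1


def ack_threshold(n: int) -> int:
    """Number of acks the coordinator collects before deciding: (N + 1) div 2."""
    return (n + 1) // 2


def phase4_outcome(replies: List[str], n: int) -> str:
    """Replay the coordinator branch of Phase4 on a list of replies.

    Assumption: replies arrive in list order. Each "ack" increments the
    counter i; the first reply that is not an "ack" sends the coordinator
    to StartNextRound; reaching the threshold leads to a decision.
    """
    acks = 0
    for reply in replies:
        if acks >= ack_threshold(n):
            break  # enough acks already, remaining replies are not read
        if reply == "ack":
            acks += 1
        else:
            return "next_round"
    return "decide" if acks >= ack_threshold(n) else "waiting"
```

For instance, with n = 3 the call coordinator(0, 3) yields process 1, and two consecutive acks are enough for phase4_outcome to report a decision, whereas an ack followed by a nack leads to the next round.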
