When consensus meets self-stabilization
Journal article, 2010

This paper presents a shared-memory self-stabilizing failure detector, asynchronous consensus and replicated state-machine algorithm suite, the components of which can be started in an arbitrary state and converge to act as a virtual state-machine. Self-stabilizing algorithms can cope with transient faults. Transient faults can alter the system state to an arbitrary state and hence cause a temporary violation of the safety property of the consensus. Started in an arbitrary state, the long-lived, memory-bounded and self-stabilizing failure detector, asynchronous consensus, and replicated state-machine suite presented in the paper recovers to satisfy eventual safety and eventual liveness requirements. Several new techniques and paradigms are introduced. The bounded-memory failure detector abstracts away synchronization assumptions using bounded heartbeat counters combined with a balance-unbalance mechanism. The practically infinite paradigm is introduced in the scope of self-stabilization, where an execution of, say, 2^64 sequential steps is regarded as (practically) infinite. Finally, we present the first self-stabilizing wait-free reset mechanism that ensures eventual safety and can be used to implement efficient self-stabilizing timestamps that are of independent interest. © 2010 Elsevier Inc. All rights reserved.

Failure detector

State-machine

reset

systems

failure detectors

Distributed

impossibility

Self-stabilization

Wait-free

time

Consensus

Author

Shlomi Dolev

Ben-Gurion University of the Negev

R. I. Kat

Ben-Gurion University of the Negev

Elad Schiller

Chalmers, Computer Science and Engineering (Chalmers), Networks and Systems (Chalmers)

Journal of Computer and System Sciences

0022-0000 (ISSN) 1090-2724 (eISSN)

Vol. 76, Issue 8, p. 884-900

Subject Categories

Computer and Information Science

DOI

10.1016/j.jcss.2010.05.005

More information

Created

10/7/2017