Self-stabilizing multivalued consensus in asynchronous crash-prone systems
Journal article, 2024
Multivalued consensus is a fundamental problem in fault-tolerant distributed computing. It covers a wide range of agreement problems in which processes must agree on a single value v∈V, where |V|≥2. Existing solutions that tolerate process crash failures reduce multivalued consensus to binary consensus; examples include Mostéfaoui-Raynal-Tronel [IPL 2000] and Zhang-Chen [IPL 2009]. In this work, we aim for an even more reliable solution by leveraging self-stabilization, a strong form of fault tolerance. In addition to process and communication failures, self-stabilizing algorithms can recover from transient faults, which model any deviation from the system's intended behavior (as long as the algorithm code remains intact). To the best of our knowledge, this work presents the first self-stabilizing algorithm for multivalued consensus in asynchronous message-passing systems prone to process failures and transient faults. Our solution uses at most n concurrent invocations of binary consensus. This also improves on earlier non-self-stabilizing solutions: for example, Mostéfaoui-Raynal-Tronel's solution requires an unbounded number of sequential invocations of binary consensus.
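To illustrate how a multivalued decision can be reduced to n binary consensus instances, here is a minimal, failure-free sketch. It is not the algorithm of this article and has no self-stabilization. It assumes that n is the number of processes, that instance k votes on whether to adopt the proposal of process k, and that the decided value is the proposal with the smallest index whose instance decides 1. The names `binary_consensus_stub`, `decide`, `values`, and `delivered` are hypothetical.

```python
from typing import Dict, List, Set


def binary_consensus_stub(bits: List[int]) -> int:
    """Hypothetical stand-in for one binary consensus instance.

    Returns the same bit to every caller (agreement), and that bit was
    proposed by some caller (validity). A real instance would be a
    distributed protocol, not a local function.
    """
    return 1 if any(bits) else 0


def decide(values: Dict[int, str], delivered: Dict[int, Set[int]]) -> str:
    """Pick one proposed value using n independent binary instances.

    values[k]    : the value proposed by process k (assumed ids 0..n-1)
    delivered[i] : ids of the processes whose proposals process i has
                   received when it votes (models asynchrony)
    """
    n = len(values)
    decided_bits = []
    # The n instances are independent, so a real system could run them
    # concurrently; this sketch evaluates them one after another.
    for k in range(n):
        # Process i votes 1 in instance k iff it has seen k's proposal.
        bits = [1 if k in delivered[i] else 0 for i in range(n)]
        decided_bits.append(binary_consensus_stub(bits))
    # Every process sees the same decided bits, so every process picks
    # the same index and therefore the same value.
    for k, bit in enumerate(decided_bits):
        if bit == 1:
            return values[k]
    raise RuntimeError("no instance decided 1")


if __name__ == "__main__":
    proposals = {0: "a", 1: "b", 2: "c"}
    seen = {0: {1, 2}, 1: {1}, 2: {2}}
    print(decide(proposals, seen))
```

In this run no process has received the proposal of process 0, so instance 0 decides 0 and the value of process 1 is chosen. Validity of the binary instances is what guarantees that a chosen index corresponds to a proposal that at least one process actually received.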