Formal Specification and Verification of a Protocol for Consistent Diagnosis in Real-Time Embedded Systems
Paper i proceeding, 2008
This paper proposes a membership protocol for fault-tolerant distributed systems and describes the usage of formal verification methods to ascertain its correctness. The protocol allows nodes in a synchronous system to maintain consensus on the set of operational nodes, i.e., the membership, in the presence of omission failures and node restarts. It relies on nodes observing the transmissions of other nodes to detect failures. Consensus is maintained by exchanging a configurable number of acknowledgements for each node's message. Increasing this number makes the protocol resilient to a greater number of simultaneous or near-coincident failures. We used the SPIN model checker to formally verify the correctness of the membership protocol. This paper describes how we modeled the protocol and presents the results of the exhaustively verified model instances.