A Process Health Status Service for Safety Related Systems Using TT/ET Communication Scheduling
Paper in proceedings, 2008

This paper describes a health status protocol for distributed real-time systems that use TTCAN, FlexRay, or other networks that support both time-triggered and event-triggered communication. The protocol allows a group of co-operating processes to establish a consistent view of each other’s health status over time. It extends the instantaneous view of each process’s operational status that a process group membership protocol provides. The health status and membership protocols are intended for systems where processes (not nodes) are considered the smallest unit of failure, and where process failures can be detected and recovered locally by the host node. Such systems require a decision function that determines whether a process failure is temporary (the process is being recovered by the host node) or permanent (local recovery is not possible or was unsuccessful). Our protocol ensures that such decisions are made consistently among correct nodes despite symmetric and asymmetric omission failures.
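
To make the idea of a decision function concrete, the following is a minimal sketch and not the protocol from the paper. It assumes that every correct node already holds the same agreed per-round membership history for a process, and it classifies a failure as temporary or permanent by counting consecutive missed rounds. The names `Health`, `classify`, and `max_recovery_rounds` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Health(Enum):
    """Hypothetical health states for one process."""
    OPERATIONAL = "operational"
    TEMPORARY_FAILURE = "temporary failure"
    PERMANENT_FAILURE = "permanent failure"


def classify(membership_history, max_recovery_rounds):
    """Classify a process from its agreed membership history.

    membership_history: list of bools, oldest round first; True means the
        process was in the agreed membership view for that round.
    max_recovery_rounds: assumed bound on how many consecutive rounds local
        recovery may take before the failure is treated as permanent.
    """
    # Count the consecutive missed rounds at the end of the history.
    missed = 0
    for present in reversed(membership_history):
        if present:
            break
        missed += 1

    if missed == 0:
        return Health.OPERATIONAL
    if missed <= max_recovery_rounds:
        return Health.TEMPORARY_FAILURE
    return Health.PERMANENT_FAILURE
```

For example, `classify([True, False, False], 3)` yields `Health.TEMPORARY_FAILURE`, while a process absent for four consecutive rounds with the same bound is classified as `Health.PERMANENT_FAILURE`. Because the function is deterministic, nodes that apply it to the same agreed history reach the same decision. Establishing that agreed history despite omission failures is the hard part, and this sketch does not address it.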

fault tolerance

redundancy management


membership protocols


Johan Karlsson

Chalmers, Computer Science and Engineering, Networks and Systems

Proc. IEEE 14th Pacific Rim International Symposium on Dependable Computing (PRDC 2008)