This paper proposes a passive replication mechanism that establishes complete location and replication independence of program modules. The proposed method is based on an object model and a reliable broadcast mechanism, in which each object has a built-in automatic checkpointing mechanism. Automatic checkpointing frees users from having to designate checkpoints in application objects. With the proposed mechanism, the replication degree and execution mode of each object can be changed dynamically in accordance with the required reliability and responsibility, and passive and active replication methods can be mixed. In addition, because checkpointing and failure detection are carried out by passive modules, extra operations for maintaining reliability can be removed from active modules. The proposed method is applied to a train traffic control system, and its effectiveness and performance characteristics are described.
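To make the division of labor between active and passive modules concrete, the following is a minimal sketch and not the mechanism of the paper itself. It assumes one possible reading of the abstract: every request reaches all replicas over a broadcast channel, the active replica only executes it, and each passive replica records the request on its own so that it can catch up when promoted. Request logging with replay stands in here for the checkpointing the abstract mentions, whose details are not given. The names `Broadcast`, `Replica`, `Counter`, `promote`, and `demo` are hypothetical, the channel is an in-process stand-in for a reliable broadcast, and the application object is assumed to be deterministic.

```python
class Counter:
    """Hypothetical application object; it contains no recovery code."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value


class Replica:
    """One copy of an object, running in active or passive mode (assumed design)."""

    def __init__(self, name, active=False):
        self.name = name
        self.active = active
        self.alive = True
        self.app = Counter()
        self.log = []  # requests observed while passive, not yet applied

    def deliver(self, request):
        method, args = request
        if self.active:
            # The active replica only executes; it does no bookkeeping.
            return getattr(self.app, method)(*args)
        # A passive replica records the request itself.
        self.log.append(request)
        return None

    def promote(self):
        # Replay the recorded requests to rebuild state, then switch mode.
        for method, args in self.log:
            getattr(self.app, method)(*args)
        self.log.clear()
        self.active = True


class Broadcast:
    """Toy in-process stand-in for a reliable broadcast channel."""

    def __init__(self):
        self.members = []

    def join(self, replica):
        self.members.append(replica)

    def send(self, request):
        # Deliver to every live member; return the first non-None reply.
        replies = [m.deliver(request) for m in self.members if m.alive]
        return next((r for r in replies if r is not None), None)


def demo():
    channel = Broadcast()
    primary = Replica("primary", active=True)
    backup = Replica("backup")
    channel.join(primary)
    channel.join(backup)

    channel.send(("add", (5,)))
    channel.send(("add", (2,)))

    primary.alive = False  # simulated fail-stop of the active replica
    backup.promote()       # the passive replica takes over
    return channel.send(("add", (1,))), backup.app.value
```

In this sketch the client addresses only the channel, never a particular replica, which is one way to picture location and replication independence. Joining further `Replica` instances would raise the replication degree, and marking more than one as active would mix the two execution modes, although the toy channel returns only the first reply.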