A new checkpoint mechanism for real time operating systems

This paper presents an overview of a proposed protocol that provides application-transparent fault-tolerant services in a real-time operating system. Fault tolerance is achieved by saving checkpoints of the processes that belong to a real-time application. The approach extends some real-time system calls so that a recovery point is saved whenever the user invokes them. The protocol lets the designer of a real-time application know the temporal specification of every system call. Current real-time applications are composed of several real-time processes that share data through the interprocess communication facilities provided by the operating system. The operating system must take these interactions into account to ensure that checkpoints are consistent. It does so by tracking the communications performed since the last checkpoint and forcing dependent processes to take a checkpoint at the same time.
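The dependency tracking described above can be illustrated with a minimal sketch. It is not the protocol's actual implementation: the class `CheckpointCoordinator`, its method names, and the use of a simple counter in place of saved process state are all assumptions made for illustration. The sketch records which processes have communicated since their last checkpoint and, when one of them checkpoints, forces every transitively dependent process to checkpoint with it.

```python
from collections import defaultdict


class CheckpointCoordinator:
    """Hypothetical model of the kernel-side bookkeeping (illustrative only)."""

    def __init__(self):
        # Peers each process has communicated with since its last checkpoint.
        self._peers = defaultdict(set)
        # Number of checkpoints taken per process (stands in for saved state).
        self.checkpoints = defaultdict(int)

    def record_communication(self, sender, receiver):
        """Called whenever an IPC operation links two processes."""
        self._peers[sender].add(receiver)
        self._peers[receiver].add(sender)

    def dependents_of(self, pid):
        """Return pid plus every process transitively linked to it."""
        seen = {pid}
        stack = [pid]
        while stack:
            current = stack.pop()
            for peer in self._peers[current]:
                if peer not in seen:
                    seen.add(peer)
                    stack.append(peer)
        return seen

    def checkpoint(self, pid):
        """Checkpoint pid together with all processes that depend on it."""
        group = self.dependents_of(pid)
        for member in group:
            self.checkpoints[member] += 1
            # Dependencies are cleared: tracking restarts from this checkpoint.
            self._peers.pop(member, None)
        return group


coordinator = CheckpointCoordinator()
coordinator.record_communication("A", "B")
coordinator.record_communication("B", "C")
group = coordinator.checkpoint("A")  # A, B and C checkpoint together
```

In this sketch a process that has not communicated since its last checkpoint is checkpointed alone, so the cost of a forced checkpoint is limited to the processes that actually exchanged data.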