Measurement and implementation of dependability in a Unix environment
Demand for low-cost computer systems is growing rapidly as computers come to be used in every aspect of our lives. However, current low-end hardware has few fault-tolerance capabilities. The objective of this study is to present methods to measure the fault behavior of Unix systems and to increase the fault tolerance of the applications running on them. Emphasis is placed on accomplishing these goals without modifying the hardware or the operating system.
The first part of this study focuses on the dependability evaluation of a computer system, using message logs generated by a typical educational-institution Unix server. A method to identify possible dependencies between different error messages is presented.
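The abstract does not describe the dependency-identification method itself. As a purely illustrative sketch, one simple way to flag candidate dependencies is to count how often two different message types appear close together in time. The function name `co_occurrence_counts`, the `(timestamp, message_type)` event format, and the fixed time window are all assumptions made for this example and are not taken from the study.

```python
from collections import Counter


def co_occurrence_counts(events, window_seconds):
    """Count pairs of distinct message types logged within a time window.

    events: list of (timestamp_seconds, message_type) tuples, sorted by time.
    Returns a Counter keyed by an alphabetically sorted (type_a, type_b) tuple.
    This is a hypothetical heuristic, not the method used in the study.
    """
    counts = Counter()
    start = 0  # index of the oldest event still inside the window
    for i, (t_i, type_i) in enumerate(events):
        # Slide the window start forward past events that are too old.
        while events[start][0] < t_i - window_seconds:
            start += 1
        # Pair the current event with every earlier event in the window.
        for j in range(start, i):
            type_j = events[j][1]
            if type_j != type_i:
                counts[tuple(sorted((type_i, type_j)))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample data, for demonstration only.
    sample = [(0, "disk"), (5, "nfs"), (100, "disk"), (103, "nfs"), (500, "mem")]
    for pair, n in co_occurrence_counts(sample, window_seconds=10).items():
        print(pair, n)
```

Pairs with high counts would only be candidates for a dependency; a real analysis would also have to compare them against the rate expected from independent messages.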
For the second part of this work, we propose a portable software system that provides fault tolerance on existing Unix platforms for certain applications. The software system is a layer of routines between the operating system and the application software that supports active replication, system-call monitoring, and consistency checks. The system's architecture is described, as well as the specific methods used to achieve fault tolerance. Experimental results on the system's overhead are presented.
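The idea of active replication combined with a consistency check can be illustrated with a minimal sketch: the same request is given to several replicas and the result is accepted only if a strict majority agree. This is a simplification under stated assumptions. The replicas here are plain in-process functions rather than Unix processes, no system calls are monitored, and the name `replicated_call` is hypothetical; none of this is the architecture described in the study.

```python
from collections import Counter


def replicated_call(replicas, *args):
    """Run every replica on the same input and return the majority result.

    replicas: list of callables standing in for replicated processes
              (an assumption of this sketch).
    Results must be hashable so they can be counted.
    Raises RuntimeError if no result is shared by a strict majority.
    """
    results = []
    for replica in replicas:
        try:
            results.append(replica(*args))
        except Exception:
            # A replica that crashes simply contributes no vote.
            continue
    if not results:
        raise RuntimeError("all replicas failed")
    value, votes = Counter(results).most_common(1)[0]
    # Consistency check: require agreement from more than half of all replicas.
    if votes * 2 <= len(replicas):
        raise RuntimeError("no majority among replicas")
    return value


if __name__ == "__main__":
    def healthy(x):
        return x * 2

    def faulty(x):
        return x * 2 + 1  # simulates a replica producing a wrong value

    print(replicated_call([healthy, healthy, faulty], 21))
```

With three replicas, a single wrong or crashed replica is masked; with only two, a disagreement can be detected but not resolved, which is why the sketch raises an error in that case.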