RAID: a robust and adaptable distributed system

There is a need to design distributed systems that are not rigid in their choice of algorithms and that respond to failures and performance degradation. To meet this challenge, we formalize and experiment with design principles that allow the implementation of an adaptable distributed system. Strategies for dynamically reconfiguring the subsystems, and for determining the impact of such reconfiguration, are being studied via experiments on RAID, a prototype system under development at Purdue University. RAID provides reliable system-level support for transaction management. Other transaction-based systems include TABS [SBD*85], ARGUS [LS83], and System R* [LHM*84]. The key contribution of RAID is the system-level support it provides for building transaction-based applications. RAID supports atomic objects and atomic commitment across a set of sites. It also includes timestamp-based concurrency control mechanisms that offer a choice of methods ranging from two-phase locking to optimistic methods utilizing the semantics of transactions and the objects accessed by them. In addition, RAID has site failure and network partition control algorithms integrated with the rest of concurrent transaction processing and a replicated copy control subsystem.