A Design of a Transparent Backup System Using a Main Memory Database

In this paper, we focus on the problem of realizing high-performance systems without sacrificing system reliability. One way to achieve high performance is to use main memory as database storage. Since main memory is not immune to software bugs or electrical hazards, archive storage such as disks cannot be avoided, and some mechanism is required to reduce the time spent on checkpointing and recovery for a main memory database. For this purpose we have developed continuous backup RAMs, which periodically store their contents to archive storage while they are in use for normal processing. By extending this concept, continuous backup disks are also designed. A transparent backup system is organized using such RAMs and disks. Because the backup process is performed by a hardware mechanism during the execution of normal database operations, it does not require stopping the system or keeping transactions waiting while system status is recorded at each checkpoint; users are therefore not aware that a backup process exists. Furthermore, the interval between two adjacent checkpoints can be shortened, since hardware control does not incur much computation overhead. This short interval helps to reduce recovery time.

1 Introduction

Recently, databases have been widely used in systems in various fields, such as online process control. Systems of this kind must have the property of a real-time system, i.e. all processes can be finished before a predefined deadline, and also the property of a fault-tolerant system, where control operations do not stop even if the system fails, since the process of a plant cannot be stopped instantaneously. If a system cannot decide on an appropriate action to prevent a plant from running away within a predetermined time, or if a system stops for even a very short period, it may become unable to control the whole plant and a great loss may be suffered. For such applications, a high-performance and highly reliable system must be realized.

To achieve high performance, many papers have been published on query processing, concurrency control, database machines and main memory databases. One should notice, however, that even if high-speed database operations are realized, we must still stop a system for recovery when some serious failure occurs.

To realize high reliability, a fault-tolerant system employs a dual system approach, which avoids a time-consuming recovery process [CER85][KIM84]. In such a system, data are available regardless of any single failure of the system. Repairs can be done without affecting the availability of data, and it is assumed that repairs can be completed before the next failure. It is still necessary, however, to record system status for multiple-failure cases. A fault-tolerant system is designed around a disk-resident database, since main memory in general is not as reliable as a disk, even if it is nonvolatile.

The disk-resident system has a serious drawback for realizing a high-performance system: the disk I/O bottleneck. A highly reliable main memory database system, which has both high performance and high reliability, should be developed to solve these problems. In order to realize such systems, continuous backup RAMs were designed by the authors [KAM91]. These RAMs can store their contents to archive storage while they are in use for normal processing. In this paper, continuous backup disks are introduced as extensions of continuous backup RAMs, and a system architecture that uses such RAMs and disks to reduce the overhead of recovery is discussed.

We consider only the case where current data are lost by some failure, such as a system failure or a media failure, and the data of at least one checkpoint are available. Transaction failure is not considered here, since it does not cause data loss and should be handled by the database management system. Conventional checkpoint and recovery processes are as follows.

A) At each checkpoint, dirty pages or whole database pages are dumped to an archive storage, such as tapes or disks.
B) While database operations are performed, log records are stored in a stable storage (tapes or disks).
C) In case of system failure, the status at the latest checkpoint is transferred to the system from its archive storage, and the system can restart from this checkpoint. In case of media failure, the system must be repaired first.
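
The conventional steps A) to C) can be illustrated with a minimal Python sketch. It is not the authors' mechanism: the class name `ConventionalDatabase`, its methods, and the in-memory dictionaries standing in for archive and stable storage are all hypothetical. It also assumes that log records written after the latest checkpoint are replayed during recovery and that older records can be discarded at a checkpoint, neither of which is spelled out in the text.

```python
import copy


class ConventionalDatabase:
    """Toy model of conventional checkpoint/log recovery (illustrative only)."""

    def __init__(self):
        self.pages = {}     # volatile main memory contents
        self.dirty = set()  # pages modified since the last checkpoint
        self.archive = {}   # stands in for archive storage (tape or disk)
        self.log = []       # stands in for stable storage holding log records

    def write(self, page_id, value):
        # Step B: store a log record while the database operation is performed.
        self.log.append((page_id, value))
        self.pages[page_id] = value
        self.dirty.add(page_id)

    def checkpoint(self):
        # Step A: dump dirty pages to archive storage.
        for page_id in self.dirty:
            self.archive[page_id] = copy.deepcopy(self.pages[page_id])
        self.dirty.clear()
        # Assumption: records older than the checkpoint are no longer needed.
        self.log.clear()

    def crash(self):
        # System failure: the contents of main memory are lost.
        self.pages = {}
        self.dirty = set()

    def recover(self):
        # Step C: transfer the latest checkpoint back from archive storage.
        self.pages = copy.deepcopy(self.archive)
        # Assumption: redo the updates logged after that checkpoint.
        for page_id, value in self.log:
            self.pages[page_id] = value
            self.dirty.add(page_id)
        return len(self.log)  # number of log records replayed


db = ConventionalDatabase()
db.write("p1", 10)
db.checkpoint()
db.write("p2", 20)
db.crash()
replayed = db.recover()
```

In this toy model the number of replayed records grows with the time elapsed since the last checkpoint, which is one way to see why a shorter checkpoint interval reduces recovery time.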

[1]  Margaret H. Eich,et al.  Main memory database research directions , 1989 .

[2]  Michael J. Carey,et al.  A recovery algorithm for a high-performance memory-resident database system , 1987, SIGMOD '87.

[3]  Stefano Ceri,et al.  Distributed Databases: Principles and Systems , 1984 .

[4]  Won Kim Highly available systems for database applications , 1984, CSUR.

[5]  Margaret H. Dunham,et al.  MARS: The Design of a Main Memory Database Machine , 1987, IWDM.

[6]  Jim Gray,et al.  The Benchmark Handbook for Database and Transaction Systems , 1993 .

[7]  Michael J. Carey,et al.  A Concurrency Control Algorithm for Memory-Resident Database Systems , 1989, FODO.

[8]  Chandrasekaran Mohan,et al.  Algorithms for the management of remote backup data bases for disaster recovery , 1993, Proceedings of IEEE 9th International Conference on Data Engineering.

[9]  Margaret H. Eich Main memory database recovery , 1986 .

[10]  Michael Stonebraker,et al.  Implementation techniques for main memory database systems , 1984, SIGMOD '84.

[11]  Margaret H. Eich A classification and comparison of main memory database recovery techniques , 1987, 1987 IEEE Third International Conference on Data Engineering.

[12]  Hamid Pirahesh,et al.  ARIES: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging , 1998 .

[13]  Gail E. Kaiser,et al.  Concurrency control in advanced database applications , 1991, CSUR.

[14]  M. H. Eich Mmdb Reload Algorithms T , .

[15]  Dieter Gawlick,et al.  Processing "Hot Spots" in High Performance Systems , 1985, COMPCON.

[16]  Vijay Kumar,et al.  Performance measurement of some main memory database recovery algorithms , 1991, [1991] Proceedings. Seventh International Conference on Data Engineering.

[17]  Yahiko Kambayashi,et al.  Continuous backup systems utilizing flash memory , 1993, Proceedings of IEEE 9th International Conference on Data Engineering.

[18]  Yahiko Kambayashi,et al.  Realization of Continuously Backed-up RAMs for High-Speed Database Recovery , 1991, DASFAA.

[19]  Robert B. Hagmann A Crash Recovery Scheme for a Memory-Resident Database System , 1986, IEEE Transactions on Computers.