Paper information - Log-based recovery for middleware servers

We have developed new log-based recovery methods for middleware servers that use thread pooling, keep private in-memory state for each client, maintain shared in-memory state, and exchange messages with other middleware servers. Because crashes are observed to be rare, the shared state is relatively small, and shared-state reads and writes are infrequent, we can reduce the overhead of message logging and shared-state logging while maintaining recovery independence. Checkpointing has very little impact on ongoing activities while still reducing recovery time. Our recovery mechanism allows client private states to be recovered in parallel after a crash. We have implemented a prototype recovery infrastructure on a commercial middleware server platform; it demonstrates that the system complexity is manageable and shows promising performance results.
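To make the ideas of message logging, checkpointing, and parallel recovery of client private state concrete, here is a minimal Python sketch. It is not the paper's algorithm or its prototype: the names `Server`, `ClientSession`, and `ClientLog` are hypothetical, stable storage is simulated with an in-memory dictionary that survives the simulated crash, request handling is assumed to be deterministic, and shared state and server-to-server messages are left out entirely.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class ClientSession:
    """Hypothetical private in-memory state for one client: a running total."""
    total: int = 0
    applied: int = 0  # number of logged requests already reflected in `total`


@dataclass
class ClientLog:
    """Stand-in for stable storage; it survives the simulated crash."""
    requests: list = field(default_factory=list)  # logged request payloads
    checkpoint: tuple = (0, 0)                    # (total, applied) at last checkpoint


class Server:
    def __init__(self):
        self.logs = {}      # client_id -> ClientLog (treated as stable)
        self.sessions = {}  # client_id -> ClientSession (volatile)

    def handle(self, client_id, amount):
        log = self.logs.setdefault(client_id, ClientLog())
        log.requests.append(amount)  # log the message before applying it
        session = self.sessions.setdefault(client_id, ClientSession())
        session.total += amount
        session.applied += 1
        return session.total

    def checkpoint(self, client_id):
        # Record the session state so recovery can skip the log prefix.
        session = self.sessions[client_id]
        self.logs[client_id].checkpoint = (session.total, session.applied)

    def crash(self):
        self.sessions = {}  # volatile state is lost; the logs remain

    def _recover_one(self, client_id):
        log = self.logs[client_id]
        total, applied = log.checkpoint
        session = ClientSession(total, applied)
        for amount in log.requests[applied:]:  # replay only the log suffix
            session.total += amount
            session.applied += 1
        return client_id, session

    def recover(self):
        # Each client's log is independent here, so sessions can be
        # rebuilt concurrently by a pool of worker threads.
        with ThreadPoolExecutor() as pool:
            self.sessions = dict(pool.map(self._recover_one, list(self.logs)))
```

In this sketch a checkpoint only shortens the portion of the log that must be replayed, and recovering one client never reads another client's log, which is what lets the per-client replays run in parallel.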
