Toward upgrades-as-a-service in distributed systems

Unavailability in distributed enterprise systems is usually the result of planned events, such as upgrades, rather than failures. Major system upgrades entail complex data conversions that are difficult to perform on the fly, in the face of live workloads. Minimizing the downtime imposed by such conversions is a time-intensive and error-prone manual process. We propose upgrades-as-a-service, a novel approach that can eliminate all the causes of planned downtime recorded during the upgrade history of one of the ten most popular websites. Building on the lessons learned from past research on live upgrades in middleware systems, upgrades-as-a-service trades a need for additional hardware resources during the upgrade for the ability to perform end-to-end upgrades online, with minimal application-specific knowledge.
