Instant OS Updates via Userspace Checkpoint-and-Restart

Operating systems have grown increasingly complex in recent years and are therefore more prone to security and performance issues. Updates that address these issues are accordingly released more often and matter more. Completing such an update requires a reboot, which causes unavoidable downtime and loses the state of running applications. We present KUP, a practical OS update mechanism that combines userspace checkpoint-and-restart with a fast in-place kernel switch. The checkpoint-and-restart component uses an optimized data structure for checkpointing on disk and a mechanism that persists memory across the update. Together, these allow instant kernel updates across major kernel versions without any kernel modifications. Our evaluation shows that, unlike well-known dynamic hot-patching techniques (e.g., ksplice), KUP can support any type of real kernel patch (e.g., security fixes, minor releases, or even major releases) while large-scale applications such as memcached and mysql are running, or in the middle of a Linux kernel compilation. Moreover, KUP can update a running Linux kernel from v3.17-rc7 to v4.1 in 3 seconds of overall downtime without losing 32 GB of memcached data.
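
The update cycle described above (checkpoint in userspace, switch kernels in place, restart the applications) can be sketched as an ordered command plan. This is only an illustration under stated assumptions: the abstract does not name the tools KUP builds on, so the `criu` and `kexec` command lines below are stand-ins for a generic userspace checkpoint tool and an in-place kernel loader, and the function name `build_update_plan` and all paths are hypothetical. The sketch does not model KUP's optimized on-disk data structure or its memory persistence mechanism, and it only builds the commands without running them.

```python
from typing import List


def build_update_plan(pid: int, image_dir: str, kernel: str, initrd: str) -> List[List[str]]:
    """Return the commands, in order, for a checkpoint -> kernel switch -> restore cycle.

    Nothing is executed here. The caller decides how and when to run each step;
    the last step necessarily runs after the new kernel has booted.
    """
    return [
        # 1. Checkpoint the process tree rooted at `pid` into image files.
        #    (Assumption: CRIU stands in for the userspace checkpoint tool.)
        ["criu", "dump", "--tree", str(pid), "--images-dir", image_dir, "--shell-job"],
        # 2. Stage the new kernel and initrd, reusing the current kernel command line.
        ["kexec", "--load", kernel, f"--initrd={initrd}", "--reuse-cmdline"],
        # 3. Jump into the staged kernel without going through firmware.
        ["kexec", "--exec"],
        # 4. Once the new kernel is up, restore the applications from the images.
        ["criu", "restore", "--images-dir", image_dir, "--shell-job"],
    ]


if __name__ == "__main__":
    # Hypothetical PID and paths, for illustration only.
    for step in build_update_plan(1234, "/var/lib/ckpt", "/boot/vmlinuz-new", "/boot/initrd-new"):
        print(" ".join(step))
```

In this plain form the image directory must live on storage that survives the kernel switch, so restore time is dominated by reading the images back from disk. That cost is what the memory persistence mechanism mentioned in the abstract is meant to reduce.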
