Engineering Record And Replay For Deployability: Extended Technical Report

The ability to record and replay program executions with low overhead enables many applications, such as reverse-execution debugging, debugging of hard-to-reproduce test failures, and "black box" forensic analysis of failures in deployed systems. Existing record-and-replay approaches limit deployability by recording an entire virtual machine (heavyweight), modifying the OS kernel (adding deployment and maintenance costs), requiring pervasive code instrumentation (imposing significant performance and complexity overhead), or modifying compilers and runtime systems (limiting generality). We investigated whether it is possible to build a practical record-and-replay system that avoids all of these issues. The answer turns out to be yes, provided the CPU and operating system meet certain non-obvious constraints. Fortunately, modern Intel CPUs, Linux kernels, and user-space frameworks do meet these constraints, although this has only recently become true. With some novel optimizations, our system, "rr", records and replays real-world low-parallelism workloads with low overhead, using an entirely user-space implementation on stock hardware, compilers, runtimes, and operating systems. "rr" forms the basis of an open-source reverse-execution debugger that sees significant use in practice. We present the design and implementation of "rr", describe its performance on a variety of workloads, and identify the constraints on hardware and operating system design required to support our approach.
