R2: An Application-level Kernel for Record and Replay
USENIX Association, 8th USENIX Symposium on Operating Systems Design and Implementation

Library-based record and replay tools aim to reproduce an application's execution by recording the results of selected functions in a log and, during replay, returning the results from the log rather than executing the functions. These tools must ensure that a replay run is identical to the record run. Doing so is challenging for three reasons: only invocations of a function made by the application itself should be recorded; recording the side effects of a function call can be difficult; and skipping function calls during replay, multithreading, and the presence of the tool itself may all change the application's behavior between recording and replay. These problems have limited the use of such tools. R2 allows developers to choose functions that can be recorded and replayed correctly. Developers annotate the chosen functions with simple keywords so that R2 can handle calls with side effects and multithreading. R2 generates code for record and replay from templates, allowing developers to avoid implementing stubs for hundreds of functions manually. To track whether an invocation is on behalf of the application or the implementation of a selected function, R2 maintains a mode bit, which stubs save and restore. We have implemented R2 on Windows and annotated large parts (1,300 functions) of the Win32 API, and two higher-level interfaces (MPI and SQLite). R2 can replay multithreaded web and database servers that previous library-based tools cannot replay. By allowing developers to choose high-level interfaces, R2 can also keep recording overhead small; experiments show that its recording overhead for Apache is approximately 10%, that recording and replaying at the SQLite interface can reduce the log size by up to 99% (compared to doing so at the Win32 API), and that using optimization annotations for BitTorrent and MPI applications achieves log size reductions ranging from 13.7% to 99.4%.
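The mode-bit idea can be illustrated with a minimal Python sketch. This is not R2's implementation (R2 targets Windows and generates its stubs from annotated templates); the `Recorder` class, its `wrap` method, and the `low`/`high` functions are hypothetical names chosen for illustration. The sketch only shows a stub that saves and restores a per-thread mode bit so that calls made from inside a selected function are not logged, and a replay path that returns logged results instead of executing the function. It does not handle side effects or the ordering of calls across threads.

```python
import threading


class Recorder:
    """Toy record/replay wrapper (hypothetical; not R2's actual design)."""

    def __init__(self, mode, log=None):
        assert mode in ("record", "replay")
        self.mode = mode
        self.log = [] if log is None else list(log)
        self._cursor = 0                 # next log entry to replay
        self._state = threading.local()  # holds the per-thread mode bit

    def _in_app(self):
        # True when the current thread is running application code.
        return getattr(self._state, "in_app", True)

    def wrap(self, name, func):
        def stub(*args, **kwargs):
            if not self._in_app():
                # Called from inside a selected function: execute, do not log.
                return func(*args, **kwargs)
            if self.mode == "replay":
                logged_name, result = self.log[self._cursor]
                if logged_name != name:
                    raise RuntimeError("replay diverged from the recorded log")
                self._cursor += 1
                return result            # the real function is not executed
            saved = self._in_app()       # save the mode bit
            self._state.in_app = False
            try:
                result = func(*args, **kwargs)
            finally:
                self._state.in_app = saved   # restore the mode bit
            self.log.append((name, result))
            return result
        return stub


# Usage: `high` calls `low` internally; only the application-level call is logged.
calls = []
rec = Recorder("record")


def low():
    calls.append("low")
    return 7


low_w = rec.wrap("low", low)


def high():
    calls.append("high")
    return low_w() + 1


recorded = rec.wrap("high", high)()
recorded_calls = list(calls)

calls.clear()
rep = Recorder("replay", rec.log)
replayed = rep.wrap("high", high)()
replayed_calls = list(calls)
```

In this sketch the log holds a single entry for `high`, because the nested call to `low` runs while the mode bit marks the thread as being inside a selected function. During replay neither function body runs; the stub returns the logged result.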
