HARDWARE AND SOFTWARE APPROACHES FOR DETERMINISTIC MULTI-PROCESSOR REPLAY OF CONCURRENT PROGRAMS

As multi-processors become mainstream, software developers must harness the parallelism available in programs to keep up with multi-core performance. Writing parallel programs, however, is notoriously diffi cult, even for the most advanced programmers. " e main reason for this lies in the nondeterministic nature of concurrent programs, which makes it very diffi cult to reproduce a program execution. As a result, reasoning about program behavior is challenging. For instance, debugging concurrent programs is known to be diffi cult because of the non-determinism of multi-threaded programs. Malicious code can hide behind non-determinism, making software vulnerabilities much more diffi cult to detect on multi-threaded programs. In this article, we explore hardware and software avenues for improving the programmability of Intel® multi-processors. In particular, we investigate techniques for reproducing a non-deterministic program execution that can effi ciently deal with the issues just mentioned. We identify the main challenges associated with these techniques, examine opportunities to overcome some of these challenges, and explore potential usage models of program execution reproducibility for debugging and fault tolerance of concurrent programs.

[1]  Josep Torrellas,et al.  DeLorean: Recording and Deterministically Replaying Shared-Memory Multiprocessor Execution Ef?ciently , 2008, International Symposium on Computer Architecture.

[2]  Samuel T. King,et al.  Detecting past and present intrusions through vulnerability-specific predicates , 2005, SOSP '05.

[3]  David Lorge Parnas,et al.  Concurrent control with “readers” and “writers” , 1971, CACM.

[4]  Yasushi Saito,et al.  Jockey: a user-space library for record-replay debugging , 2005, AADEBUG'05.

[5]  Marvin V. Zelkowitz Reversible execution , 1973, CACM.

[6]  Jong-Deok Choi,et al.  Deterministic replay of Java multithreaded applications , 1998, SPDT '98.

[7]  Yuanyuan Zhou,et al.  PRES: probabilistic replay with execution sketching on multiprocessors , 2009, SOSP '09.

[8]  Volkmar Sieh,et al.  Framework for testing the fault-tolerance of systems including OS and network aspects , 2001, Proceedings Sixth IEEE International Symposium on High Assurance Systems Engineering. Special Topic: Impact of Networking.

[9]  Håkan Grahn,et al.  Transactional memory , 2010, J. Parallel Distributed Comput..

[10]  Fred B. Schneider,et al.  Hypervisor-based fault tolerance , 1996, TOCS.

[11]  Peter M. Chen,et al.  Execution replay for intrusion analysis , 2006 .

[12]  Peter M. Chen,et al.  Execution replay of multiprocessor virtual machines , 2008, VEE '08.

[13]  Marek Olszewski,et al.  Kendo: efficient deterministic multithreading in software , 2009, ASPLOS.

[14]  Hiralal Agrawal,et al.  Towards automatic debugging of computer programs , 1992 .

[15]  Tal Garfinkel,et al.  VMwareDecoupling Dynamic Program Analysis from Execution in Virtual Environments , 2008, USENIX Annual Technical Conference.

[16]  Ali-Reza Adl-Tabatabai,et al.  Architecting a chunk-based memory race recorder in Modern CMPs , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[17]  Koen De Bosschere,et al.  RecPlay: a fully integrated practical record/replay system , 1999, TOCS.

[18]  Stuart I. Feldman,et al.  IGOR: a system for program debugging via reversible execution , 1988, PADD '88.

[19]  Samuel T. King,et al.  Debugging Operating Systems with Time-Traveling Virtual Machines (Awarded General Track Best Paper Award!) , 2005, USENIX Annual Technical Conference, General Track.

[20]  E. N. Elnozahy,et al.  Supporting nondeterministic execution in fault-tolerant systems , 1996, Proceedings of Annual Symposium on Fault Tolerant Computing.

[21]  Leslie Lamport,et al.  Time, clocks, and the ordering of events in a distributed system , 1978, CACM.

[22]  Josep Torrellas,et al.  BulkSC: bulk enforcement of sequential consistency , 2007, ISCA '07.

[23]  Min Xu ReTrace : Collecting Execution Trace with Virtual Machine Deterministic Replay , 2007 .

[24]  Josep Torrellas,et al.  DeLorean: Recording and Deterministically Replaying Shared-Memory Multiprocessor Execution Ef?ciently , 2008, 2008 International Symposium on Computer Architecture.

[25]  James R. Larus,et al.  Transactional Memory , 2006, Transactional Memory.

[26]  Min Xu,et al.  A "flight data recorder" for enabling full-system multiprocessor deterministic replay , 2003, ISCA '03.

[27]  Xuezheng Liu,et al.  Usenix Association 8th Usenix Symposium on Operating Systems Design and Implementation R2: an Application-level Kernel for Record and Replay , 2022 .

[28]  Bob Boothe A fully capable bidirectional debugger , 2000, SOEN.

[29]  Ion Stoica,et al.  ODR: output-deterministic replay for multicore debugging , 2009, SOSP '09.

[30]  Daniel Gooch,et al.  Communications of the ACM , 2011, XRDS.

[31]  Samuel T. King,et al.  ReVirt: enabling intrusion analysis through virtual-machine logging and replay , 2002, OPSR.

[32]  Srikanth Kandula,et al.  Flashback: A Lightweight Extension for Rollback and Deterministic Replay for Software Debugging , 2004, USENIX Annual Technical Conference, General Track.

[33]  Thomas J. LeBlanc,et al.  Debugging Parallel Programs with Instant Replay , 1987, IEEE Transactions on Computers.

[34]  Satish Narayanasamy,et al.  BugNet: continuously recording program execution for deterministic replay debugging , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[35]  Milos Prvulovic,et al.  CORD: cost-effective (and nearly overhead-free) order-recording and data race detection , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..

[36]  Sanjay Bhansali,et al.  Framework for instruction-level tracing and analysis of program executions , 2006, VEE '06.

[37]  Derek Hower,et al.  Rerun: Exploiting Episodes for Lightweight Memory Race Recording , 2008, 2008 International Symposium on Computer Architecture.

[38]  Scott Shenker,et al.  Replay debugging for distributed applications , 2006 .

[39]  Authors' Biographies , 2005 .

[40]  Josep Torrellas,et al.  Capo: a software-hardware interface for practical deterministic multiprocessor replay , 2009, ASPLOS.

[41]  Mark Russinovich,et al.  Replay for concurrent non-deterministic shared-memory applications , 1996, PLDI '96.

[42]  Cristiano Pereira,et al.  Reproducible user-level simulation of multi-threaded workloads , 2007 .

[43]  Samuel T. King,et al.  Operating System Support for Virtual Machines , 2003, USENIX Annual Technical Conference, General Track.