Leveraging Hardware-Assisted Virtualization for Deterministic Replay on Commodity Multi-Core Processors

Deterministic replay, the ability to travel backward in time and reconstruct the past execution flow of a multiprocessor system, has many prominent applications. Prior research in this area falls into two categories: hardware-only schemes and software-only schemes. Hardware-only schemes deliver high performance but require significant modifications to existing hardware. Software-only schemes work on commodity hardware but suffer from excessive performance overhead and huge logs. In this article, we present the design and implementation of a novel system, Samsara, which uses hardware-assisted virtualization (HAV) extensions to achieve efficient deterministic replay without requiring any hardware modification. Unlike prior software schemes, which trace every single memory access to record interleaving, Samsara leverages HAV on commodity processors to track the read-set and write-set and implements a chunk-based recording scheme in software. This avoids all memory access detections, which are a major source of overhead in prior work. Evaluation results show that, compared with prior software-only schemes, Samsara reduces the log file size to 1/70th on average and reduces the recording overhead from about 10×, as reported by state-of-the-art work, to 2.1× on average.
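
The chunk-based idea mentioned above can be illustrated with a minimal Python sketch. This is not Samsara's implementation: the names `Chunk`, `conflicts`, and `ChunkRecorder` are hypothetical, addresses are abstract integers, and the read-set and write-set are assumed to be supplied by some tracking mechanism (HAV in the article) rather than computed here. The sketch only shows why logging a commit order plus set intersections can stand in for tracing each memory access.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A run of instructions on one virtual CPU (hypothetical structure)."""
    vcpu: int
    size: int                                    # instructions in the chunk
    read_set: set = field(default_factory=set)   # locations read
    write_set: set = field(default_factory=set)  # locations written


def conflicts(committing: Chunk, other: Chunk) -> bool:
    """True if `other` read or wrote anything that `committing` wrote."""
    return bool(committing.write_set & (other.read_set | other.write_set))


class ChunkRecorder:
    """Logs chunk commit order instead of individual memory accesses."""

    def __init__(self):
        self.log = []  # one (vcpu, size) entry per committed chunk

    def commit(self, chunk: Chunk, in_flight: list) -> list:
        """Record the commit and return the in-flight chunks it conflicts with."""
        self.log.append((chunk.vcpu, chunk.size))
        return [c for c in in_flight if conflicts(chunk, c)]


recorder = ChunkRecorder()
a = Chunk(vcpu=0, size=1200, read_set={0x10}, write_set={0x20})
b = Chunk(vcpu=1, size=900, read_set={0x20}, write_set={0x30})
c = Chunk(vcpu=2, size=500, read_set={0x40}, write_set={0x41})

# Chunk a commits first; b read a location that a wrote, c is independent.
stale = recorder.commit(a, in_flight=[b, c])
```

In this sketch the log grows by one small entry per chunk rather than one entry per access, which is the intuition behind the smaller logs the abstract reports. How conflicting chunks are actually handled is not described in the abstract and is left out here.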
