Concurrent breakpoints

In program debugging, reproducibility of bugs is a key requirement. Unfortunately, bugs in concurrent programs are notoriously difficult to reproduce because bugs due to concurrency happen under very specific thread schedules and the likelihood of taking such corner-case schedules during regular testing is very low. We propose concurrent breakpoints, a light-weight and programmatic way to make a concurrency bug reproducible. We describe a mechanism that helps to hit a concurrent breakpoint in a concurrent execution with high probability. We have implemented concurrent breakpoints as a light-weight library for Java and C/C++ programs. We have used the implementation to deterministically reproduce several known non-deterministic bugs in real-world concurrent Java and C/C++ programs with almost 100% probability.

[1]  Rahul Agarwal,et al.  Optimized run-time race detection and atomicity checking using partial discovered types , 2005, ASE.

[2]  Jens Krinke,et al.  Context-sensitive slicing of concurrent programs , 2003, ESEC/FSE-11.

[3]  Mayur Naik,et al.  From symptom to cause: localizing errors in counterexample traces , 2003, POPL '03.

[4]  Jong-Deok Choi,et al.  Deterministic replay of Java multithreaded applications , 1998, SPDT '98.

[5]  Michael I. Jordan,et al.  Scalable statistical bug isolation , 2005, PLDI '05.

[6]  Jong-Deok Choi,et al.  Efficient and precise datarace detection for multithreaded object-oriented programs , 2002, PLDI '02.

[7]  Azadeh Farzan,et al.  The Complexity of Predicting Atomicity Violations , 2009, TACAS.

[8]  Ion Stoica,et al.  ODR: output-deterministic replay for multicore debugging , 2009, SOSP '09.

[9]  David W. Binkley,et al.  Program slicing , 2008, 2008 Frontiers of Software Maintenance.

[10]  Josep Torrellas,et al.  DeLorean: Recording and Deterministically Replaying Shared-Memory Multiprocessor Execution Ef?ciently , 2008, International Symposium on Computer Architecture.

[11]  Yuanyuan Zhou,et al.  CTrigger: exposing atomicity violation bugs from their hiding places , 2009, ASPLOS.

[12]  Xiangyu Zhang,et al.  Efficient forward computation of dynamic slices using reduced ordered binary decision diagrams , 2004, Proceedings. 26th International Conference on Software Engineering.

[13]  Koushik Sen,et al.  A randomized dynamic program analysis technique for detecting real deadlocks , 2009, PLDI '09.

[14]  Madan Musuvathi,et al.  Iterative context bounding for systematic testing of multithreaded programs , 2007, PLDI '07.

[15]  Francesco Sorrentino,et al.  PENELOPE: weaving threads to expose atomicity violations , 2010, FSE '10.

[16]  Bensalem Saddek,et al.  SCALABLE DYNAMIC DEADLOCK ANALYSIS OF MULTI-THREADED PROGRAMS , 2005 .

[17]  Koushik Sen,et al.  Randomized active atomicity violation detection in concurrent programs , 2008, SIGSOFT '08/FSE-16.

[18]  Koushik Sen,et al.  Race directed random testing of concurrent programs , 2008, PLDI '08.

[19]  Robert H. B. Netzer Optimal tracing and replay for debugging shared-memory parallel programs , 1993, PADD '93.

[20]  Koushik Sen,et al.  CalFuzzer: An Extensible Active Testing Framework for Concurrent Programs , 2009, CAV.

[21]  Min Xu,et al.  A "flight data recorder" for enabling full-system multiprocessor deterministic replay , 2003, ISCA '03.

[22]  Dennis Giffhorn,et al.  Precise slicing of concurrent programs , 2009, Automated Software Engineering.

[23]  Shan Lu,et al.  ConMem: detecting severe concurrency bugs through an effect-oriented approach , 2010, ASPLOS XV.

[24]  Grigore Rosu,et al.  Improved multithreaded unit testing , 2011, ESEC/FSE '11.

[25]  Shing-Chi Cheung,et al.  Detecting atomic-set serializability violations in multithreaded programs through active randomized testing , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[26]  Shan Lu,et al.  Cooperative crug isolation , 2009, WODA '09.

[27]  Stephen N. Freund,et al.  Atomizer: A dynamic atomicity checker for multithreaded programs , 2008, Sci. Comput. Program..

[28]  ChoiJong-Deok,et al.  Efficient and precise datarace detection for multithreaded object-oriented programs , 2002 .

[29]  Alex Groce,et al.  What Went Wrong: Explaining Counterexamples , 2003, SPIN.

[30]  Edith Schonberg,et al.  Detecting access anomalies in programs with critical sections , 1991, PADD '91.

[31]  Koen De Bosschere,et al.  Cyclic Debugging Using Execution Replay , 2001, International Conference on Computational Science.

[32]  Shmuel Ur,et al.  Deadlocks: From Exhibiting to Healing , 2008, RV.

[33]  Thomas Ball,et al.  Finding and Reproducing Heisenbugs in Concurrent Programs , 2008, OSDI.

[34]  Josep Torrellas,et al.  DeLorean: Recording and Deterministically Replaying Shared-Memory Multiprocessor Execution Ef?ciently , 2008, 2008 International Symposium on Computer Architecture.

[35]  Mangala Gowri Nanda,et al.  Interprocedural slicing of multithreaded programs with applications to Java , 2006, TOPL.

[36]  Thomas J. LeBlanc,et al.  Debugging Parallel Programs with Instant Replay , 1987, IEEE Transactions on Computers.

[37]  Satish Narayanasamy,et al.  BugNet: continuously recording program execution for deterministic replay debugging , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[38]  Mark Russinovich,et al.  Replay for concurrent non-deterministic shared-memory applications , 1996, PLDI '96.

[39]  Alex Groce,et al.  SPECIAL S ECTION O N T OOLS A ND A LGORITHMS F OR THE C ONSTRUCTION A ND A NALYSIS O F S YSTEMS , 2005 .

[40]  Klaus Havelund,et al.  Model checking programs , 2000, Proceedings ASE 2000. Fifteenth IEEE International Conference on Automated Software Engineering.

[41]  Matthew B. Dwyer,et al.  Exploiting Object Escape and Locking Information in Partial-Order Reductions for Concurrent Object-Oriented Programs , 2004, Formal Methods Syst. Des..

[42]  Koen De Bosschere,et al.  RecPlay: a fully integrated practical record/replay system , 1999, TOCS.

[43]  Michael Burrows,et al.  Eraser: a dynamic data race detector for multithreaded programs , 1997, TOCS.

[44]  Yuanyuan Zhou,et al.  PRES: probabilistic replay with execution sketching on multiprocessors , 2009, SOSP '09.

[45]  Pravesh Kothari,et al.  A randomized scheduler with probabilistic guarantees of finding bugs , 2010, ASPLOS XV.

[46]  Yuanyuan Zhou,et al.  Learning from mistakes: a comprehensive study on real world concurrency bug characteristics , 2008, ASPLOS.