Efficient on-the-fly data race detection in multithreaded C++ programs

Data race detection is essential for debugging multithreaded programs and assuring their correctness. Nevertheless, no single universal technique handles the task efficiently, since the data race detection problem is computationally hard in the general case. To approximate the possible races in a program, all currently available tools therefore take different "shortcuts", such as making strong assumptions about the program structure or applying various heuristics. When applied to a general program, however, they usually produce excessive false alarms or leave a large number of races undetected.

Another major drawback of many currently available tools is that they are restricted, for performance reasons, to detection units of fixed size. They therefore all suffer from the same problem: choosing a small unit might result in missing some of the data races, while choosing a large one might lead to false detection.

In this paper we present a novel testing tool, called MultiRace, which combines improved versions of Djit and Lockset, two very powerful on-the-fly algorithms for dynamic detection of apparent data races. Both extended algorithms detect races in multithreaded programs that may execute on weak consistency systems and may use two-way as well as global synchronization primitives.

By employing novel technologies, MultiRace adjusts its detection to the native granularity of objects and variables in the program under examination. In order to monitor all accesses to each of the shared locations, MultiRace instruments the C++ source code of the program. It lets the user fine-tune the detection process, but otherwise is completely automatic and transparent.

This paper describes the algorithms employed in MultiRace, discusses some of its implementation issues, and proposes several optimizations to it. The paper shows that the overheads imposed by MultiRace are often much smaller (orders of magnitude) than those obtained by other existing dynamic techniques.
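
To make the Lockset idea mentioned above more concrete, the following is a minimal sketch of basic lockset refinement: each shared location keeps a set of candidate locks, and every access intersects that set with the locks the accessing thread currently holds. It is an illustration only, not MultiRace's implementation. The class `LocksetChecker`, its methods, and the integer thread, lock, and address identifiers are all hypothetical names chosen for this example, and the sketch omits the improvements the paper describes (it does not distinguish reads from writes, and it has no special handling of initialization or single-threaded access).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical identifier types for this sketch only.
using ThreadId = int;
using LockId = int;
using Address = std::uintptr_t;

// Simplified lockset checker (assumption: basic refinement only,
// not the improved algorithm used by MultiRace).
class LocksetChecker {
public:
    // Record that thread t now holds lock l.
    void acquire(ThreadId t, LockId l) { held_[t].insert(l); }

    // Record that thread t no longer holds lock l.
    void release(ThreadId t, LockId l) {
        assert(held_[t].count(l) == 1 && "releasing a lock that is not held");
        held_[t].erase(l);
    }

    // Record an access by thread t to location a.
    // Returns true if, after this access, no single lock has protected
    // every access to a seen so far (a potential data race).
    bool access(ThreadId t, Address a) {
        const std::set<LockId>& held = held_[t];
        auto it = candidates_.find(a);
        if (it == candidates_.end()) {
            // First access: every lock held now is a candidate.
            candidates_.emplace(a, held);
            return false;
        }
        // Later accesses: keep only candidates that are also held now.
        std::set<LockId> kept;
        for (LockId l : it->second) {
            if (held.count(l) != 0) kept.insert(l);
        }
        it->second = kept;
        return it->second.empty();
    }

private:
    std::map<ThreadId, std::set<LockId>> held_;      // locks held per thread
    std::map<Address, std::set<LockId>> candidates_; // candidate locks per location
};
```

In this sketch, two threads that access the same location under different locks trigger a report on the second access, while two threads that consistently use the same lock do not. Because the check depends only on which locks are held and not on the observed interleaving, a scheme like this can flag accesses that never actually raced in the monitored run, which is one reason to pair it with a happens-before detector such as Djit.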
