Accurate and efficient filtering for the Intel thread checker race detector

Debugging data races in parallel applications is a difficult task. Error-causing data races may appear to vanish due to changes in an application's optimization level, thread scheduling, whether or not a debugger is used, and other effects. Further, many race conditions cause incorrect program behavior only in rare scenarios and may lie undetected during software testing.Tools exist today that do a decent job in finding data races in multi-threaded applications. Some data-race detection tools are very efficient and can detect data races with less than a 2x performance penalty. Most such tools, however, do not provide enough information to the user, require recompilation, or impose other usage restrictions. Other tools, such as the one considered in this paper (Intel's Thread Checker), provide users with plenty of useful information and can be used with any application binary, but have high overheads - often over 200x. It is the goal of this paper to speed up Thread Checker by filtering out the vast majority of memory references that are highly unlikely to be involved in data races. In our work, we develop filters that filter 90-100% of all memory references from the datarace detection algorithm, resulting in speedups of 2.2-5.5x, with an average improvement of 3.3x.

[1]  J. Torrellas,et al.  ReEnact: using thread-level speculation mechanisms to debug data races in multithreaded codes , 2003, 30th Annual International Symposium on Computer Architecture, 2003. Proceedings..

[2]  Barton P. Miller,et al.  Improving the accuracy of data race detection , 1991, PPOPP '91.

[3]  Jong-Deok Choi,et al.  Deterministic replay of Java multithreaded applications , 1998, SPDT '98.

[4]  Michael Burrows,et al.  Eraser: a dynamic data race detector for multithreaded programs , 1997, TOCS.

[5]  Friedemann Mattern,et al.  Virtual Time and Global States of Distributed Systems , 2002 .

[6]  Anoop Gupta,et al.  The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.

[7]  Colin J. Fidge,et al.  Logical time in distributed computing systems , 1991, Computer.

[8]  Tushara C Karunaratna Nondeterminator-3 : a provably good data-race detector that runs in parallel , 2005 .

[9]  Min Xu,et al.  A "flight data recorder" for enabling full-system multiprocessor deterministic replay , 2003, ISCA '03.

[10]  Jong-Deok Choi,et al.  Efficient and precise datarace detection for multithreaded object-oriented programs , 2002, PLDI '02.

[11]  Koen De Bosschere,et al.  RecPlay: a fully integrated practical record/replay system , 1999, TOCS.

[12]  Charles E. Leiserson,et al.  Detecting data races in Cilk programs that use locks , 1998, SPAA '98.

[13]  Jong-Deok Choi,et al.  Hybrid dynamic data race detection , 2003, PPoPP '03.

[14]  John M. Mellor-Crummey,et al.  On-the-fly detection of data races for programs with nested fork-join parallelism , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).