Efficient Detection of Determinacy Races in Cilk Programs

Abstract. A parallel multithreaded program that is ostensibly deterministic may nevertheless behave nondeterministically due to bugs in the code. These bugs are called determinacy races, and they result when one thread updates a location in shared memory while another thread is concurrently accessing the location. We have implemented a provably efficient determinacy-race detector for Cilk, an algorithmic multithreaded programming language. If a Cilk program is run on a given input data set, our debugging tool, which we call the ``Nondeterminator,'' either determines at least one location in the program that is subject to a determinacy race, or else it certifies that the program is race free when run on the data set. The core of the Nondeterminator is an asymptotically efficient serial algorithm (inspired by Tarjan's nearly linear-time least-common-ancestors algorithm) for detecting determinacy races in series-parallel directed acyclic graphs. For a Cilk program that runs in T time on one processor and uses v shared-memory locations, the Nondeterminator runs in O(T α(v,v)) time, where α is Tarjan's functional inverse of Ackermann's function, a very slowly growing function which, for all practical purposes, is bounded above by 4 . The Nondeterminator uses at most a constant factor more space than does the original program. On a variety of Cilk program benchmarks, the Nondeterminator exhibits a slowdown of less than 12 compared with the serial execution time of the original optimized code, which we contend is an acceptable slowdown for debugging purposes.

[1]  Peter J. Keleher,et al.  Online data-race detection via coherency guarantees , 1996, OSDI '96.

[2]  Charles E. Leiserson,et al.  Detecting data races in Cilk programs that use locks , 1998, SPAA '98.

[3]  Robert E. Tarjan,et al.  Fast Algorithms for Finding Nearest Common Ancestors , 1984, SIAM J. Comput..

[4]  Robert E. Tarjan,et al.  Applications of Path Compression on Balanced Trees , 1979, JACM.

[5]  Robert E. Tarjan,et al.  Efficiency of a Good But Not Linear Set Union Algorithm , 1972, JACM.

[6]  David Padua,et al.  Debugging Fortran on a shared memory machine , 1987 .

[7]  TarjanRobert Endre,et al.  Fast algorithms for finding nearest common ancestors , 1984 .

[8]  David A. Padua,et al.  Event synchronization analysis for debugging parallel programs , 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).

[9]  Matteo Frigo,et al.  An analysis of dag-consistent distributed shared-memory algorithms , 1996, SPAA '96.

[10]  Robert H. B. Netzer,et al.  Efficient Race Condition Detection for Shared-Memory Programs with Post/Wait Synchronization , 1992, International Conference on Parallel Processing.

[11]  Jong-Deok Choi,et al.  An efficient cache-based access anomaly detection scheme , 1991, ASPLOS IV.

[12]  Arthur J. Bernstein,et al.  Analysis of Programs for Parallel Processing , 1966, IEEE Trans. Electron. Comput..

[13]  Ronald L. Rivest,et al.  Introduction to Algorithms , 1990 .

[14]  Matteo Frigo,et al.  The implementation of the Cilk-5 multithreaded language , 1998, PLDI.

[15]  Edith Schonberg,et al.  Detecting access anomalies in programs with critical sections , 1991, PADD '91.

[16]  Mellor-CrummeyJohn,et al.  Compile-time support for efficient data race detection in shared-memory parallel programs , 1993 .

[17]  Jong-Deok Choi,et al.  A Mechanism for Efficient Debugging of Parallel Programs , 1988, PLDI.

[18]  Bradley C. Kuszmaul,et al.  Cilk: an efficient multithreaded runtime system , 1995, PPOPP '95.

[19]  David A. Padua,et al.  Automatic detection of nondeterminacy in parallel programs , 1988, PADD '88.

[20]  Jacobo Valdes Ayesta Parsing flowcharts and series-parallel graphs , 1978 .

[21]  Guy L. Steele,et al.  Making asynchronous parallelism safe for the world , 1989, POPL '90.

[22]  John M. Mellor-Crummey,et al.  On-the-fly detection of data races for programs with nested fork-join parallelism , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).

[23]  Edith Schonberg,et al.  An empirical comparison of monitoring algorithms for access anomaly detection , 2011, PPOPP '90.

[24]  Uzi Vishkin,et al.  On Finding Lowest Common Ancestors: Simplification and Parallelization , 1988, AWOC.

[25]  Michael Burrows,et al.  Eraser: a dynamic data race detector for multi-threaded programs , 1997, TOCS.

[26]  Claude E. Shannon,et al.  The Number of Two‐Terminal Series‐Parallel Networks , 1942 .

[27]  P MillerBarton,et al.  What are race conditions , 1992 .

[28]  Alfred V. Aho,et al.  On finding lowest common ancestors in trees , 1973, SIAM J. Comput..

[29]  R. Duffin Topology of series-parallel networks , 1965 .