Dynamic Determinacy Race Detection for Task Parallelism with Futures

Existing dynamic determinacy race detectors for task-parallel programs are limited to programs with strict computation graphs, in which a task can wait only for its descendant tasks to complete. In this paper, we present the first known determinacy race detector for non-strict computation graphs constructed using futures. When a program uses only structured parallel constructs such as spawn-sync and async-finish, the space and time complexity of our algorithm are similar to those of the classical SP-bags algorithm. In the presence of point-to-point synchronization using futures, the complexity of the algorithm increases by a factor determined by the number of future task creation and get operations, as well as the number of non-tree edges in the computation graph. Our experimental results show that the slowdown of our algorithm relative to the sequential version ranges from 1.00\(\times\) to 9.92\(\times\), which is in line with the slowdowns reported for strict computation graphs in past work.
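
To make the notion of a non-tree edge concrete, the following is a minimal sketch of a brute-force determinacy race check on a small computation graph. It is not the algorithm presented in this paper and has none of its complexity guarantees: it records every access and tests reachability with a depth-first search. The class `ComputationGraph`, the helper `build`, and the step names (`main0`, `F`, `B0`, `B1`) are all hypothetical and chosen only for illustration. A future task `F` writes `x`, and a sibling task `B`, which is not an ancestor of `F`, reads `x`; the two accesses are ordered only if `B` performs a get on `F`, which adds a non-tree edge from `F` to `B1`.

```python
from collections import defaultdict


class ComputationGraph:
    """Hypothetical brute-force checker; illustration only, not the paper's algorithm."""

    def __init__(self):
        self.succ = defaultdict(set)  # step -> set of successor steps
        self.accesses = []            # (step, location, is_write) in recording order

    def add_edge(self, u, v):
        """Record that step u must complete before step v starts."""
        self.succ[u].add(v)

    def access(self, step, location, is_write):
        self.accesses.append((step, location, is_write))

    def reaches(self, u, v):
        """Depth-first search: is there a directed path from u to v?"""
        stack, seen = [u], {u}
        while stack:
            n = stack.pop()
            if n == v:
                return True
            for m in self.succ[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        return False

    def races(self):
        """Pairs of accesses to one location, at least one a write, with no path either way."""
        found = []
        for i, (s1, loc1, w1) in enumerate(self.accesses):
            for s2, loc2, w2 in self.accesses[i + 1:]:
                conflicting = loc1 == loc2 and (w1 or w2) and s1 != s2
                if conflicting and not self.reaches(s1, s2) and not self.reaches(s2, s1):
                    found.append((s1, s2, loc1))
        return found


def build(with_get):
    """main creates future task F, then spawns sibling task B (steps B0, B1)."""
    g = ComputationGraph()
    g.add_edge("main0", "F")      # future task creation
    g.add_edge("main0", "main1")  # continuation of main
    g.add_edge("main1", "B0")     # spawn of sibling task B
    g.add_edge("B0", "B1")        # continuation inside B
    if with_get:
        g.add_edge("F", "B1")     # B calls get on F: a non-tree edge
    g.access("F", "x", True)      # F writes x
    g.access("B1", "x", False)    # B reads x in step B1
    return g


print(build(with_get=True).races())
print(build(with_get=False).races())
```

With the get edge present, the write in `F` reaches the read in `B1` and no race is reported; without it, the two steps are unordered and the pair is flagged. Because `B` is a sibling of `F` rather than an ancestor, this graph is non-strict in the sense described above.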
