On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two algorithms, one serial and one parallel, to maintain series-parallel (<b><i>SP</i></b>) relationships "on the fly" for fork-join multithreaded programs. The serial <b><i>SP-order</i></b> algorithm runs in <i>O</i>(1) amortized time per operation. In contrast, the previously best algorithm requires time per operation proportional to Tarjan's functional inverse of Ackermann's function. SP-order employs an order-maintenance data structure that allows us to implement a more efficient "English-Hebrew" labeling scheme than was used in earlier race detectors, which immediately yields an improved determinacy-race detector. In particular, any fork-join program running in <i>T</i><inf>1</inf> time on a single processor can be checked on the fly for determinacy races in <i>O</i>(<i>T</i><inf>1</inf>) time. Corresponding improved bounds can also be obtained for more sophisticated data-race detectors, for example, those that use locks.

By combining SP-order with Feng and Leiserson's serial <b><i>SP-bags</i></b> algorithm, we obtain a parallel SP-maintenance algorithm, called <b><i>SP-hybrid</i></b>. Suppose that a fork-join program has <i>n</i> threads, <i>T</i><inf>1</inf> work, and a critical-path length of <i>T</i><inf>∞</inf>. We prove that, when executed on <i>P</i> processors, SP-hybrid runs in <i>O</i>((<i>T</i><inf>1</inf>/<i>P</i> + <i>PT</i><inf>∞</inf>) lg <i>n</i>) expected time. To understand this bound, consider that the original program obtains linear speed-up over a 1-processor execution when <i>P</i> = <i>O</i>(<i>T</i><inf>1</inf>/<i>T</i><inf>∞</inf>). In contrast, SP-hybrid obtains linear speed-up when <i>P</i> = <i>O</i>(√(<i>T</i><inf>1</inf>/<i>T</i><inf>∞</inf>)), but the work is increased by a factor of <i>O</i>(lg <i>n</i>).
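
The "English-Hebrew" labeling idea mentioned above can be illustrated with a small sketch. It is not the SP-order algorithm itself: SP-order maintains the two orders on the fly in order-maintenance data structures, whereas this sketch assumes the whole SP parse tree is already known and simply assigns integer positions. The names `Leaf`, `Node`, `order`, `label`, `precedes`, and `parallel` are hypothetical and chosen for illustration only. The English order visits children left to right everywhere; the Hebrew order visits the right child first at P-nodes. One thread precedes another exactly when it comes first in both orders, and two threads are logically parallel when the orders disagree.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    name: str  # a thread in the fork-join program


@dataclass
class Node:
    kind: str  # "S" for series composition, "P" for parallel composition
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def order(tree: Tree, hebrew: bool, out: List[str]) -> List[str]:
    """Append thread names to `out` in English or Hebrew order and return it."""
    if isinstance(tree, Leaf):
        out.append(tree.name)
        return out
    first, second = tree.left, tree.right
    # The two orders differ only at P-nodes, where Hebrew goes right to left.
    if hebrew and tree.kind == "P":
        first, second = second, first
    order(first, hebrew, out)
    order(second, hebrew, out)
    return out


def label(tree: Tree) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Map each thread to its position in the English and Hebrew orders."""
    eng = {name: i for i, name in enumerate(order(tree, False, []))}
    heb = {name: i for i, name in enumerate(order(tree, True, []))}
    return eng, heb


def precedes(eng: Dict[str, int], heb: Dict[str, int], u: str, v: str) -> bool:
    """u must execute before v: u comes first in both orders."""
    return eng[u] < eng[v] and heb[u] < heb[v]


def parallel(eng: Dict[str, int], heb: Dict[str, int], u: str, v: str) -> bool:
    """u and v are logically parallel: the two orders disagree."""
    return (eng[u] < eng[v]) != (heb[u] < heb[v])


# Thread a runs, then b and c run in parallel, then d runs.
tree = Node("S", Leaf("a"), Node("S", Node("P", Leaf("b"), Leaf("c")), Leaf("d")))
eng, heb = label(tree)
print(precedes(eng, heb, "a", "b"))  # a is in series before b
print(parallel(eng, heb, "b", "c"))  # b and c are logically parallel
```

Static integer labels like these would have to be renumbered whenever a new thread is inserted between two existing ones. Replacing them with an order-maintenance structure that supports insertion and comparison in <i>O</i>(1) amortized time is what lets SP-order answer the same queries on the fly.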
