Fork/join parallelism in the wild: documenting patterns and anti-patterns in Java programs using the fork/join framework

Now that multicore processors are commonplace, developing parallel software has escaped the confines of high-performance computing and entered the mainstream. The Fork/Join framework, for instance, has been part of the standard Java platform since version 7. Fork/Join is a high-level parallel programming model intended to make parallelizing recursive divide-and-conquer algorithms particularly easy. While Fork/Join is, in theory, a simple and effective technique for exposing parallelism in applications, whether and how it is applied in practice has not previously been investigated. We therefore performed an empirical study on a corpus of 120 open source Java projects that use the framework for roughly 362 different tasks. On the one hand, we confirm the frequent use of four best-practice patterns (Sequential Cutoff, Linked Subtasks, Leaf Tasks, and avoiding unnecessary forking) in actual projects. On the other hand, we discovered three recurring anti-patterns that potentially limit parallel performance: sub-optimal use of Java collections when splitting tasks into subtasks, sub-optimal use of Java collections when merging the results of subtasks, and inappropriate sharing of resources between tasks. We document these anti-patterns and study their impact on performance.
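As an illustration of the kind of code the study examines, the following is a minimal sketch of a recursive divide-and-conquer task written against the standard `java.util.concurrent` Fork/Join API. The class `RangeSum`, the helper `parallelSum`, and the `CUTOFF` value are hypothetical names and numbers chosen for this example; they are not taken from the studied projects. The sketch reflects one common reading of two of the named patterns: a sequential cutoff below which the task stops splitting, and forking only one subtask while computing the other in the current thread. It also splits by index range instead of copying data into new collections, which is one way to sidestep collection overhead when splitting.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example task: sums the half-open range [lo, hi) of a long[] array.
class RangeSum extends RecursiveTask<Long> {
    // Assumed threshold for illustration; a real value would be tuned by measurement.
    static final int CUTOFF = 1_000;

    private final long[] data;
    private final int lo;
    private final int hi;

    RangeSum(long[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        // Sequential cutoff: small ranges are summed directly, without further splitting.
        if (hi - lo <= CUTOFF) {
            long sum = 0;
            for (int i = lo; i < hi; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Split by index; both subtasks read the same array, nothing is copied.
        int mid = lo + (hi - lo) / 2;
        RangeSum left = new RangeSum(data, lo, mid);
        RangeSum right = new RangeSum(data, mid, hi);
        left.fork();                      // schedule the left half asynchronously
        long rightSum = right.compute();  // run the right half in the current thread
        return left.join() + rightSum;    // wait for the left half and merge
    }

    // Convenience entry point that runs the task on the common pool.
    static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new RangeSum(data, 0, data.length));
    }
}
```

Calling `RangeSum.parallelSum(values)` on a `long[]` returns the same total as a plain loop. Whether it is faster depends on the array size, the cutoff, and the machine, so the cutoff here should be treated as a placeholder.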
