Lithe: enabling efficient composition of parallel libraries

For the software industry to take advantage of multicore processors, we must allow programmers to arbitrarily compose parallel libraries without sacrificing performance. We argue that high-level task or thread abstractions and a common global scheduler cannot provide effective library composition. Instead, the operating system should expose unvirtualized processing resources that can be shared cooperatively between parallel libraries within an application. In this paper, we describe a system that standardizes and facilitates the exchange of these unvirtualized processing resources between libraries.
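The cooperative sharing described above can be illustrated with a toy model. This is a minimal sketch, not the paper's interface: the `CorePool` class and its `grant` and `yield_back` methods are hypothetical names introduced here, and cores are modeled as plain integers rather than real hardware contexts. It shows only the central idea that a fixed set of cores is lent to libraries on request and returned voluntarily, so the application never oversubscribes the machine.

```python
class CorePool:
    """Hypothetical stand-in for the fixed set of cores given to one application."""

    def __init__(self, num_cores):
        self.free = list(range(num_cores))  # core ids not currently lent out
        self.owner = {}                     # core id -> name of the library holding it

    def grant(self, library, wanted):
        """Lend up to `wanted` free cores to `library`; never more than exist."""
        count = min(wanted, len(self.free))
        granted = [self.free.pop() for _ in range(count)]
        for core in granted:
            self.owner[core] = library
        return granted

    def yield_back(self, library, cores):
        """A library cooperatively returns cores it no longer needs."""
        for core in cores:
            if self.owner.get(core) != library:
                raise ValueError(f"{library} does not hold core {core}")
            del self.owner[core]
            self.free.append(core)


pool = CorePool(4)
solver_cores = pool.grant("solver", 3)   # one library takes three of four cores
blas_cores = pool.grant("blas", 3)       # a second library gets only the one left
pool.yield_back("solver", solver_cores)  # the first library finishes and yields
more_blas = pool.grant("blas", 2)        # freed cores flow to the second library
```

In this model a request is a hint rather than a guarantee: a library receives at most what is free and must make progress with whatever it is given, which is what keeps the total number of running contexts bounded by the number of cores.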
