Hood: A user-level threads library for multiprogrammed multiprocessors

TheHood user-level threads library delivers efficient performance under multiprogramming without any need for kernel-level resource management, such as cosc heduling or process control. It does so by scheduling threads with a non-blocking implementation o f the work-stealing algorithm. With this implementation, the execution time of a program running wit h arbitrarily many processes on arbitrarily many processors can be modeled as a simple function of work an d critical-path length. This model holds even when the program runs on a set of processors that arbitra rily grows and shrinks over time. In all cases, we observe linear speedup whenever the number of proc esses is small relative to the parallelism.

[1]  Piet Hut,et al.  A hierarchical O(N log N) force-calculation algorithm , 1986, Nature.

[2]  Evangelos P. Markatos,et al.  First-class user-level threads , 1991, SOSP '91.

[3]  Bradley C. Kuszmaul,et al.  Cilk: an efficient multithreaded runtime system , 1995, PPOPP '95.

[4]  Brian N. Bershad,et al.  Scheduler activations: effective kernel support for the user-level management of parallelism , 1991, TOCS.

[5]  Raj Vaswani,et al.  A dynamic processor allocation policy for multiprogrammed shared-memory multiprocessors , 1993, TOCS.

[6]  Patrick Sobalvarro,et al.  Demand-Based Coscheduling of Parallel Jobs on Multiprogrammed Multiprocessors , 1995, JSSPP.

[7]  Edward W. Felten,et al.  Performance issues in non-blocking synchronization on shared-memory multiprocessors , 1992, PODC '92.

[8]  G ValiantLeslie A bridging model for parallel computation , 1990 .

[9]  John K. Ousterhout Scheduling Techniques for Concurrebt Systems. , 1982, ICDCS 1982.

[10]  Evangelos P. Markatos,et al.  Multiprogramming on multiprocessors , 1991, Proceedings of the Third IEEE Symposium on Parallel and Distributed Processing.

[11]  David R. Keppel,et al.  Tools and Techniques for Building Fast Portable Threads Packages , 1993 .

[12]  Frank Mueller,et al.  A Library Implementation of POSIX Threads under UNIX , 1993, USENIX Winter.

[13]  John K. Ousterhout,et al.  Scheduling Techniques for Concurrent Systems , 1982, ICDCS.

[14]  Maged M. Michael,et al.  Relative performance of preemption-safe locking and non-blocking synchronization on multiprogrammed shared memory multiprocessors , 1997, Proceedings 11th International Parallel Processing Symposium.

[15]  Leslie G. Valiant,et al.  A bridging model for parallel computation , 1990, CACM.

[16]  Anoop Gupta,et al.  The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.

[17]  Matteo Frigo,et al.  The implementation of the Cilk-5 multithreaded language , 1998, PLDI.

[18]  Robert H. Halstead,et al.  Implementation of multilisp: Lisp on a multiprocessor , 1984, LFP '84.

[19]  C. Greg Plaxton,et al.  Thread Scheduling for Multiprogrammed Multiprocessors , 1998, SPAA '98.

[20]  Anoop Gupta,et al.  Process control and scheduling issues for multiprogrammed shared-memory multiprocessors , 1989, SOSP '89.

[21]  Andrea C. Arpaci-Dusseau,et al.  Effective distributed scheduling of parallel workloads , 1996, SIGMETRICS '96.

[22]  Andrea C. Arpaci-Dusseau,et al.  The interaction of parallel and sequential workloads on a network of workstations , 1995, SIGMETRICS '95/PERFORMANCE '95.

[23]  Maurice Herlihy,et al.  Wait-free synchronization , 1991, TOPL.

[24]  Brian N. Bershad,et al.  Practical considerations for non-blocking concurrent objects , 1993, [1993] Proceedings. The 13th International Conference on Distributed Computing Systems.

[25]  Devang Shah,et al.  Programming with threads , 1996 .

[26]  Raj Vaswani,et al.  The implications of cache affinity on processor scheduling for multiprogrammed, shared memory multiprocessors , 1991, SOSP '91.

[27]  Anoop Gupta,et al.  The impact of operating system scheduling policies and synchronization methods of performance of parallel applications , 1991, SIGMETRICS '91.

[28]  Ken Arnold,et al.  The Java Programming Language , 1996 .

[29]  Robert D. Blumofe,et al.  The performance of work stealing in multiprogrammed environments (extended abstract) , 1998, SIGMETRICS '98/PERFORMANCE '98.