Composing parallel software efficiently with Lithe
[1] Charles L. Lawson,et al. Basic Linear Algebra Subprograms for Fortran Usage , 1979, TOMS.
[2] Mitchell Wand,et al. Continuation-Based Multiprocessing , 1980, High. Order Symb. Comput..
[3] Anoop Gupta,et al. Process control and scheduling issues for multiprogrammed shared-memory multiprocessors , 1989, SOSP '89.
[4] Lawrence W. Dowdy,et al. Dynamic partitioning in a transputer environment , 1990, SIGMETRICS '90.
[5] Evangelos P. Markatos,et al. First-class user-level threads , 1991, SOSP '91.
[6] Brian N. Bershad,et al. Scheduler activations: effective kernel support for the user-level management of parallelism , 1991, TOCS.
[7] Raj Vaswani,et al. A dynamic processor allocation policy for multiprogrammed shared-memory multiprocessors , 1993, TOCS.
[8] William E. Weihl,et al. Lottery scheduling: flexible proportional-share resource management , 1994, OSDI '94.
[9] Jack L. Lo,et al. Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[10] Laxmikant V. Kalé,et al. Threads for Interoperable Parallel Programming , 1996, LCPC.
[11] Seth Copen Goldstein,et al. Lazy Threads: Implementing a Fast Parallel Call , 1996, J. Parallel Distributed Comput..
[12] Dean M. Tullsen,et al. Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[13] Bryan Ford,et al. CPU inheritance scheduling , 1996, OSDI '96.
[14] Bradley C. Kuszmaul,et al. Cilk: an efficient multithreaded runtime system , 1995, PPOPP '95.
[15] Richard J. Enbody,et al. Comparing gang scheduling with dynamic space sharing on symmetric multiprocessors using automatic self-allocating threads (ASAT) , 1997, Proceedings 11th International Parallel Processing Symposium.
[16] Rohit Chandra,et al. Parallel programming in OpenMP , 2000 .
[17] John Regehr,et al. Using hierarchical scheduling to support soft real-time applications in general-purpose operating systems , 2001 .
[18] John H. Reppy,et al. Compiler support for lightweight concurrency , 2002 .
[19] Marvin Theimer,et al. Cooperative Task Management Without Manual Stack Management , 2002, USENIX Annual Technical Conference, General Track.
[20] George C. Necula,et al. Capriccio: scalable threads for internet services , 2003, SOSP '03.
[21] Ravi R. Iyer,et al. CQoS: a framework for enabling QoS in shared caches of CMP platforms , 2004, ICS '04.
[22] James Reinders,et al. Intel threading building blocks - outfitting C++ for multi-core processor parallelism , 2007 .
[23] Simon L. Peyton Jones,et al. Lightweight concurrency primitives for GHC , 2007, Haskell '07.
[24] Samuel Williams,et al. Optimization of sparse matrix-vector multiplication on emerging multicore platforms , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[25] Katherine A. Yelick,et al. Multi-threading and one-sided communication in parallel LU factorization , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[26] Guang R. Gao,et al. A parallel dynamic programming algorithm on a multi-core architecture , 2007, SPAA '07.
[27] John H. Reppy,et al. A scheduling framework for general-purpose parallel languages , 2008, ICFP.
[28] Krste Asanovic,et al. Globally-Synchronized Frames for Guaranteed Quality-of-Service in On-Chip Networks , 2008, 2008 International Symposium on Computer Architecture.
[29] Timothy Roscoe,et al. 30 seconds is not enough!: a study of operating system timer usage , 2008, EuroSys '08.
[30] Christopher Hughes,et al. Scalable HMM based inference engine in large vocabulary continuous speech recognition , 2009, 2009 IEEE International Conference on Multimedia and Expo.
[31] Katherine Yelick,et al. Optimizing collective communication on multicores , 2009 .
[32] Lapack Working. Scheduling Linear Algebra Operations on Multicore Processors – , 2009 .
[33] Roberto Ierusalimschy,et al. Revisiting coroutines , 2009, TOPLAS.
[34] Kevin Klues,et al. Tessellation: space-time partitioning in a manycore client OS , 2009 .
[35] Timothy A. Davis,et al. Multifrontal multithreaded rank-revealing sparse QR factorization , 2009, Combinatorial Scientific Computing.
[36] Jack Dongarra,et al. Scheduling dense linear algebra operations on multicore processors , 2010 .
[37] Timothy A. Davis,et al. Algorithm 915, SuiteSparseQR: Multifrontal multithreaded rank-revealing sparse QR factorization , 2011, TOMS.