Scheduling parallel programs by work stealing with private deques
[1] F. Warren Burton, et al. Executing functional programs on a virtual tree of processors, 1981, FPCA '81.
[2] Robert H. Halstead, et al. Implementation of multilisp: Lisp on a multiprocessor, 1984, LFP '84.
[3] Edward D. Lazowska, et al. A Comparison of Receiver-Initiated and Sender-Initiated Adaptive Load Sharing, 1986, Perform. Evaluation.
[4] Donald F. Towsley, et al. Analysis of the Effects of Delays on Load Sharing, 1989, IEEE Trans. Computers.
[5] Eli Upfal, et al. A simple load balancing scheme for task allocation in parallel machines, 1991, SPAA '91.
[6] Marc Feeley, et al. A Message Passing Implementation of Lazy Task Creation, 1992, Parallel Symbolic Computing.
[7] Marc Feeley, et al. An efficient and general implementation of futures on large scale shared-memory multiprocessors, 1993.
[8] Marc Feeley. Polling efficiently on stock hardware, 1993, FPCA '93.
[9] Guy E. Blelloch, et al. A provable time and space efficient implementation of NESL, 1996, ICFP '96.
[10] Bradley C. Kuszmaul, et al. Cilk: an efficient multithreaded runtime system, 1995, PPoPP '95.
[11] Sivarama P. Dandamudi. The effect of scheduling discipline on dynamic load sharing in heterogeneous distributed systems, 1997, Proceedings Fifth International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems.
[12] Matteo Frigo, et al. The implementation of the Cilk-5 multithreaded language, 1998, PLDI.
[13] C. Greg Plaxton, et al. Thread Scheduling for Multiprogrammed Multiprocessors, 1998, SPAA '98.
[14] Michael Mitzenmacher, et al. Analyses of load stealing models based on differential equations, 1998, SPAA '98.
[15] C. Leiserson, et al. Scheduling multithreaded computations by work stealing, 1999, Proceedings 35th Annual Symposium on Foundations of Computer Science.
[16] Guy E. Blelloch, et al. The Data Locality of Work Stealing, 2002, SPAA '00.
[17] Nir Shavit, et al. Work dealing, 2002, SPAA '02.
[18] Nir Shavit, et al. Non-blocking steal-half work queues, 2002, PODC '02.
[19] Peter Sanders, et al. Randomized Receiver Initiated Load-balancing Algorithms for Tree-shaped Computations, 2002, Comput. J.
[20] Taiichi Yuasa, et al. Pursuing Laziness for Efficient Implementation of Modern Multithreaded Languages, 2003, ISHPC.
[21] Leslie Ann Goldberg, et al. The Natural Work-Stealing Algorithm is Stable, 2001, SIAM J. Comput.
[22] Mark Moir, et al. A dynamic-sized nonblocking work stealing deque, 2006, Distributed Computing.
[23] David Chase, et al. Dynamic circular work-stealing deque, 2005, SPAA '05.
[24] Stephen L. Olivier, et al. Dynamic Load Balancing of Unbalanced Computations Using Message Passing, 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[25] Christopher J. Hughes, et al. Carbon: architectural support for fine-grained parallelism on chip multiprocessors, 2007, ISCA '07.
[26] Sriram Krishnamoorthy, et al. Solving Large, Irregular Graph Problems Using Adaptive Work-Stealing, 2008, 2008 37th International Conference on Parallel Processing.
[27] Guy E. Blelloch, et al. Provably good multicore cache performance for divide-and-conquer algorithms, 2008, SODA '08.
[28] John H. Reppy, et al. Implicitly-threaded parallelism in Manticore, 2008, ICFP 2008.
[29] Sriram Krishnamoorthy, et al. Scalable work stealing, 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[30] Taiichi Yuasa, et al. Backtracking-based load balancing, 2009, PPoPP '09.
[31] Maged M. Michael, et al. Idempotent work stealing, 2009, PPoPP '09.
[32] Denis Trystram, et al. A Tighter Analysis of Work Stealing, 2010, ISAAC.
[33] Simon L. Peyton Jones, et al. Regular, shape-polymorphic, parallel arrays in Haskell, 2010, ICFP '10.
[34] Christoforos E. Kozyrakis, et al. Flexible architectural support for fine-grain scheduling, 2010, ASPLOS XV.
[35] David Cunningham, et al. A performance model for X10 applications: what's going on under the hood?, 2011, X10 '11.
[36] Alexandros Tzannes. Enhancing Productivity and Performance Portability of General-Purpose Parallel Programming, 2012.
[37] Guy E. Blelloch, et al. Internally deterministic parallel algorithms can be fast, 2012, PPoPP '12.
[38] Scheduling parallel programs by work stealing with private deques, 2013, PPoPP.