Optimizing Work Stealing Communication with Structured Atomic Operations
[1] Richard F. Barrett,et al. Scheduling Chapel Tasks with Qthreads on Manycore: A Tale of Two Schedulers, 2017, ROSS@HPDC.
[2] Olivier Tardieu,et al. A work-stealing scheduler for X10's task parallelism with suspension , 2012, PPoPP '12.
[3] Y.-K. Kwok,et al. Static scheduling algorithms for allocating directed task graphs to multiprocessors , 1999, CSUR.
[4] Sriram Krishnamoorthy,et al. Scalable work stealing , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[5] Guy E. Blelloch,et al. A provable time and space efficient implementation of NESL , 1996, ICFP '96.
[6] Sriram Krishnamoorthy,et al. Lifeline-based global load balancing , 2011, PPoPP '11.
[7] Laxmikant V. Kalé,et al. Hierarchical Load Balancing for Charm++ Applications on Large Supercomputers , 2010, 2010 39th International Conference on Parallel Processing Workshops.
[8] Vivek Sarkar,et al. SLAW: A scalable locality-aware adaptive work-stealing scheduler , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[9] Arthur Charguéraud,et al. Scheduling parallel programs by work stealing with private deques , 2013, PPoPP '13.
[10] Benoît Meister,et al. The Open Community Runtime: A runtime system for extreme scale computing , 2016, 2016 IEEE High Performance Extreme Computing Conference (HPEC).
[11] Alexander Aiken,et al. Legion: Expressing locality and independence with logical regions , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[12] Sriram Krishnamoorthy,et al. Scioto: A Framework for Global-View Task Parallelism , 2008, 2008 37th International Conference on Parallel Processing.
[13] Robert D. Blumofe,et al. Adaptive and Reliable Parallel Computing on Networks of Workstations , 1997 .
[14] Vivek Sarkar,et al. Work-First and Help-First Scheduling Policies for Terminally Strict Parallel Programs , 2008 .
[15] Matteo Frigo,et al. The implementation of the Cilk-5 multithreaded language , 1998, PLDI.
[16] Vivek Sarkar,et al. X10: an object-oriented approach to non-uniform cluster computing , 2005, OOPSLA '05.
[17] D. Brian Larkins,et al. Accelerated Work Stealing , 2019, ICPP.
[18] Laxmikant V. Kalé,et al. A load balancing strategy for prioritized execution of tasks , 1993, [1993] Proceedings Seventh International Parallel Processing Symposium.
[19] Maged M. Michael,et al. Idempotent work stealing , 2009, PPoPP '09.
[20] William J. Knottenbelt,et al. Parallel multilevel algorithms for hypergraph partitioning , 2008, J. Parallel Distributed Comput..
[21] Ümit V. Çatalyürek,et al. Hypergraph-based Dynamic Load Balancing for Adaptive Scientific Computations , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[22] Bryan Carpenter,et al. ARMCI: A Portable Remote Memory Copy Library for Distributed Array Libraries and Compiler Run-Time Systems , 1999, IPPS/SPDP Workshops.
[23] Chau-Wen Tseng,et al. A message passing benchmark for unbalanced applications , 2008, Simul. Model. Pract. Theory.
[24] Brian W. Barrett,et al. The Portals 4.3 Network Programming Interface , 2014 .
[25] Message Passing Interface Forum. MPI: A Message-Passing Interface Standard , 1994 .
[26] Vivek Sarkar,et al. Optimized Distributed Work-Stealing , 2016, 2016 6th Workshop on Irregular Applications: Architecture and Algorithms (IA3).
[27] Nir Shavit,et al. Non-blocking steal-half work queues , 2002, PODC '02.
[28] Michael Lang,et al. Optimizing load balancing and data-locality with data-aware scheduling , 2014, 2014 IEEE International Conference on Big Data (Big Data).
[29] Vipin Kumar,et al. Scalable Load Balancing Techniques for Parallel Computers , 1994, J. Parallel Distributed Comput..