Adaptive Cache Aware Bitier Work-Stealing in Multisocket Multicore Architectures
Quan Chen | Minyi Guo | Zhiyi Huang
[1] Tao Yang, et al. A Comparison of Clustering Heuristics for Scheduling Directed Acyclic Graphs on Multiprocessors, 1992, J. Parallel Distributed Comput.
[2] Jens Palsberg, et al. Featherweight X10: a core calculus for async-finish parallelism, 2010, PPoPP '10.
[3] Michael Stumm, et al. Online performance analysis by statistical sampling of microprocessor performance counters, 2005, ICS '05.
[4] Bradley C. Kuszmaul, et al. Cilk: an efficient multithreaded runtime system, 1995, PPoPP '95.
[5] M. Berger, et al. Adaptive mesh refinement for hyperbolic partial differential equations, 1982.
[6] Richard Cole, et al. Analysis of Randomized Work Stealing with False Sharing, 2011, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[7] Mark Moir, et al. A dynamic-sized nonblocking work stealing deque, 2006, Distributed Computing.
[8] David R. Butenhof. Programming with POSIX threads, 1993.
[9] Xiaoning Ding, et al. ULCC: a user-level facility for optimizing shared cache performance on multicores, 2011, PPoPP '11.
[10] Sebastian Burckhardt, et al. The design of a task parallel library, 2009, OOPSLA.
[11] Yi Guo, et al. Work-first and help-first scheduling policies for async-finish task parallelism, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[12] Robert D. Blumofe, et al. Executing multithreaded programs efficiently, 1995.
[13] Alejandro Duran, et al. The Design of OpenMP Tasks, 2009, IEEE Transactions on Parallel and Distributed Systems.
[14] Guy E. Blelloch, et al. Scheduling threads for constructive cache sharing on CMPs, 2007, SPAA '07.
[15] Doug Lea, et al. A Java fork/join framework, 2000, JAVA '00.
[16] Charles E. Leiserson, et al. The Cilk++ concurrency platform, 2009, 2009 46th ACM/IEEE Design Automation Conference.
[17] Quan Chen, et al. CAB: Cache Aware Bi-tier Task-Stealing in Multi-socket Multi-core Architecture, 2011, 2011 International Conference on Parallel Processing.
[18] Wenguang Chen, et al. Maotai: View-Oriented Parallel Programming on CMT Processors, 2008, 2008 37th International Conference on Parallel Processing.
[19] David Chase, et al. Dynamic circular work-stealing deque, 2005, SPAA '05.
[20] Yi Guo, et al. SLAW: A scalable locality-aware adaptive work-stealing scheduler, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[21] Guy E. Blelloch, et al. Provably good multicore cache performance for divide-and-conquer algorithms, 2008, SODA '08.
[22] Maged M. Michael, et al. Idempotent work stealing, 2009, PPoPP '09.
[23] Guy E. Blelloch, et al. Low depth cache-oblivious algorithms, 2010, SPAA '10.
[24] Frédéric Wagner, et al. Hierarchical Work-Stealing, 2010, Euro-Par.
[25] Nir Shavit, et al. Non-blocking steal-half work queues, 2002, PODC '02.
[26] James Reinders, et al. Intel® threading building blocks, 2008.
[27] Guy E. Blelloch, et al. Scheduling irregular parallel computations on hierarchical caches, 2011, SPAA '11.
[28] Quan Chen, et al. CATS: cache aware task-stealing based on online profiling in multi-socket multi-core architectures, 2012, ICS '12.
[29] Yi Guo, et al. SLAW: A scalable locality-aware adaptive work-stealing scheduler, 2010, IPDPS.
[30] Lei Wang, et al. An adaptive task creation strategy for work-stealing scheduling, 2010, CGO '10.
[31] Guy E. Blelloch, et al. The Data Locality of Work Stealing, 2002, SPAA '00.