Architectural Support for Exploiting Fine Grain Parallelism
Mikel Luján | Ian Watson | Isuru Herath | Demian Rosas-Ham | Paraskevas Yiapanis
[1] John Paul Shen,et al. Multiple Instruction Stream Processor , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[2] Karl-Filip Faxén. Efficient Work Stealing for Fine Grained Parallelism , 2010, 2010 39th International Conference on Parallel Processing.
[3] Robert H. Halstead,et al. Lazy task creation: a technique for increasing the granularity of parallel programs , 1990, LISP and Functional Programming.
[4] Niraj K. Jha,et al. Garnet: A Detailed Interconnect Model Inside a Full-System Simulation Framework.
[5] Charles E. Leiserson,et al. The Cilk++ concurrency platform , 2009, 2009 46th ACM/IEEE Design Automation Conference.
[6] Christopher J. Hughes,et al. Carbon: architectural support for fine-grained parallelism on chip multiprocessors , 2007, ISCA '07.
[7] Alejandro Duran,et al. The Design of OpenMP Tasks , 2009, IEEE Transactions on Parallel and Distributed Systems.
[8] Milo M. K. Martin,et al. Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset , 2005, CARN.
[9] Paraskevas Evripidou,et al. TFlux: A Portable Platform for Data-Driven Multithreading on Commodity Multicore Systems , 2008, 2008 37th International Conference on Parallel Processing.
[10] Akinori Yonezawa,et al. StackThreads/MP: integrating futures into calling standards , 1999, PPoPP '99.
[11] Sebastian Burckhardt,et al. The design of a task parallel library , 2009, OOPSLA.
[12] Fredrik Larsson,et al. Simics: A Full System Simulation Platform , 2002, Computer.
[13] Yi Guo,et al. SLAW: A scalable locality-aware adaptive work-stealing scheduler , 2010, IPDPS.
[14] Matteo Frigo,et al. The implementation of the Cilk-5 multithreaded language , 1998, PLDI.
[15] Christoforos E. Kozyrakis,et al. Flexible architectural support for fine-grain scheduling , 2010, ASPLOS XV.
[16] Doug Lea,et al. A Java fork/join framework , 2000, JAVA '00.
[17] Robert D. Blumofe,et al. Scheduling multithreaded computations by work stealing , 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.
[18] Kathryn S. McKinley,et al. Hoard: a scalable memory allocator for multithreaded applications , 2000, SIGP.
[19] Yi Guo,et al. SLAW: A scalable locality-aware adaptive work-stealing scheduler , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[20] Seth Copen Goldstein,et al. Active Messages: A Mechanism for Integrated Communication and Computation , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.
[21] Sanjay J. Patel,et al. Rigel: an architecture and scalable programming interface for a 1000-core accelerator , 2009, ISCA '09.
[22] Andrei Sergeevich Terechko,et al. A Hardware Task Scheduler for Embedded Video Processing , 2008, HiPEAC.
[23] Mats Brorsson,et al. A Comparison of some recent Task-based Parallel Programming Models , 2010.
[24] Margaret Martonosi,et al. Hardware-modulated parallelism in chip multiprocessors , 2005, CARN.
[25] Magnus Själander,et al. A Look-Ahead Task Management Unit for Embedded Multi-Core Architectures , 2008, 2008 11th EUROMICRO Conference on Digital System Design Architectures, Methods and Tools.
[26] Alexey Kukanov,et al. The Foundations for Scalable Multicore Software in Intel Threading Building Blocks , 2007.
[27] Hong Jiang,et al. Pangaea: A tightly-coupled IA32 heterogeneous chip multiprocessor , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[28] Bradley C. Kuszmaul,et al. Cilk: an efficient multithreaded runtime system , 1995, PPOPP '95.
[29] Eduard Ayguadé,et al. Task Superscalar: An Out-of-Order Task Pipeline , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.