Dynamic Scheduling of Irregular Stream Programs toward Many-Core Scalability
[1] Thomas R. Gross, et al. Memory management in NUMA multicore systems: trapped between cache contention and interconnect overhead, 2011, ISMM '11.
[2] Selim G. Akl, et al. Optimal Parallel Merging and Sorting Without Memory Conflicts, 1987, IEEE Transactions on Computers.
[3] Luca P. Carloni, et al. Flexible filters: load balancing through backpressure for stream programs, 2009, EMSOFT '09.
[4] Ying Xing, et al. The Design of the Borealis Stream Processing Engine, 2005, CIDR.
[5] Michael L. Scott, et al. Algorithms for scalable synchronization on shared-memory multiprocessors, 1991, TOCS.
[6] Sriram Krishnamoorthy, et al. Solving Large, Irregular Graph Problems Using Adaptive Work-Stealing, 2008, 2008 37th International Conference on Parallel Processing.
[7] Keshav Pingali, et al. The tao of parallelism in algorithms, 2011, PLDI '11.
[8] William Thies, et al. An empirical characterization of stream programs and its implications for language and compiler design, 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[9] Scott A. Mahlke, et al. Sponge: portable stream programming on graphics engines, 2011, ASPLOS XVI.
[10] David E. Culler, et al. SEDA: an architecture for well-conditioned, scalable internet services, 2001, SOSP.
[11] Leonardo Neumeyer, et al. S4: Distributed Stream Computing Platform, 2010, 2010 IEEE International Conference on Data Mining Workshops.
[12] E.A. Lee, et al. Synchronous data flow, 1987, Proceedings of the IEEE.
[13] Alexandra Fedorova, et al. A case for NUMA-aware contention management on multicore systems, 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[14] Albert Cohen, et al. OpenStream: Expressiveness and data-flow compilation of OpenMP streaming programs, 2012, TACO.
[15] Kun-Lung Wu, et al. Elastic scaling of data parallel operators in stream processing, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[16] Jong-Deok Choi, et al. An OpenCL framework for heterogeneous multicores with local memory, 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[17] William J. Dally, et al. Buffer-space efficient and deadlock-free scheduling of stream applications on multi-core architectures, 2010, SPAA '10.
[18] Arun Raman, et al. Parallelism orchestration using DoPE: the degree of parallelism executive, 2011, PLDI '11.
[19] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[20] Anjul Patney, et al. Task management for irregular-parallel workloads on the GPU, 2010, HPG '10.
[21] Yale N. Patt, et al. Feedback-directed pipeline parallelism, 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[22] Vivek Sarkar, et al. X10: an object-oriented approach to non-uniform cluster computing, 2005, OOPSLA '05.
[23] Scott A. Mahlke, et al. Orchestrating the execution of stream programs on multicore platforms, 2008, PLDI '08.
[24] Matteo Frigo, et al. The implementation of the Cilk-5 multithreaded language, 1998, PLDI.
[25] Michael I. Gordon. Compiler techniques for scalable performance of stream programs on multicore architectures, 2010.
[26] Leslie G. Valiant, et al. A bridging model for parallel computation, 1990, CACM.
[27] Robert Tappan Morris, et al. An Analysis of Linux Scalability to Many Cores, 2010, OSDI.
[28] Dirk Grunwald, et al. Generating, optimizing, and scheduling a compiler level representation of stream parallelism, 2011.
[29] Navendu Jain, et al. Adaptive Control of Extreme-scale Stream Processing Systems, 2006, 26th IEEE International Conference on Distributed Computing Systems (ICDCS'06).
[30] Robert Morris, et al. Non-scalable locks are dangerous, 2012.
[31] Pat Hanrahan, et al. GRAMPS: A programming model for graphics pipelines, 2009, ACM Trans. Graph.
[32] Shreekant S. Thakkar, et al. Synchronization algorithms for shared-memory multiprocessors, 1990, Computer.
[33] Christoforos E. Kozyrakis, et al. Dynamic Fine-Grain Scheduling of Pipeline Parallelism, 2011, 2011 International Conference on Parallel Architectures and Compilation Techniques.
[34] Thomas E. Anderson, et al. The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors, 1990, IEEE Trans. Parallel Distributed Syst.