StreamTMC: Stream compilation for tiled multi-core architectures
Haitao Wei | Mingkang Qin | Weiwei Zhang | Junqing Yu | Dongrui Fan | G. Gao
[1] Trevor Mudge, et al. MacroSS: macro-SIMDization of streaming applications, 2010, ASPLOS 2010.
[2] Scott A. Mahlke, et al. Sponge: portable stream programming on graphics engines, 2011, ASPLOS XVI.
[3] Michael I. Gordon, et al. Exploiting coarse-grained task, data, and pipeline parallelism in stream programs, 2006, ASPLOS XII.
[4] Dongrui Fan, et al. High Performance Matrix Multiplication on Many Cores, 2009, Euro-Par.
[5] Guang R. Gao, et al. Minimizing communication in rate-optimal software pipelining for stream programs, 2010, CGO '10.
[6] Saurabh Dighe, et al. An 80-Tile 1.28TFLOPS Network-on-Chip in 65nm CMOS, 2007, 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers.
[7] Guang R. Gao, et al. Experience on optimizing irregular computation for memory hierarchy in manycore architecture, 2008, ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming.
[8] Pat Hanrahan, et al. Brook for GPUs: stream computing on graphics hardware, 2004, SIGGRAPH 2004.
[9] Scott A. Mahlke, et al. Flextream: Adaptive Compilation of Streaming Applications for Heterogeneous Architectures, 2009, 2009 18th International Conference on Parallel Architectures and Compilation Techniques.
[10] David Kirk, et al. NVIDIA CUDA software and GPU parallel computing architecture, 2007, ISMM '07.
[11] Vivek Sarkar, et al. Baring It All to Software: Raw Machines, 1997, Computer.
[12] Edward A. Lee, et al. Synthesis of Embedded Software from Synchronous Dataflow Specifications, 1999, J. VLSI Signal Process.
[13] Abhishek Udupa, et al. Software Pipelined Execution of Stream Programs on GPUs, 2009, 2009 International Symposium on Code Generation and Optimization.
[14] Guang R. Gao, et al. Software Pipelining for Stream Programs on Resource Constrained Multicore Architectures, 2012, IEEE Transactions on Parallel and Distributed Systems.
[15] Edward A. Lee, et al. Compile-Time Scheduling and Assignment of Data-Flow Program Graphs with Data-Dependent Iteration, 1991, IEEE Trans. Computers.
[16] William Thies, et al. Cache aware optimization of stream programs, 2005, LCTES.
[17] Matteo Frigo, et al. The implementation of the Cilk-5 multithreaded language, 1998, PLDI.
[18] William Thies, et al. StreamIt: A Language for Streaming Applications, 2002, CC.
[19] Scott A. Mahlke, et al. MacroSS: macro-SIMDization of streaming applications, 2010, ASPLOS XV.
[20] William Thies, et al. Phased scheduling of stream programs, 2003.
[21] Michael F. P. O'Boyle, et al. Partitioning streaming parallelism for multi-cores: A machine learning based approach, 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[22] Scott A. Mahlke, et al. Orchestrating the execution of stream programs on multicore platforms, 2008, PLDI '08.
[23] Henry Hoffmann, et al. A stream compiler for communication-exposed architectures, 2002, ASPLOS X.
[24] Edward A. Lee, et al. A Hierarchical Multiprocessor Scheduling Framework for Synchronous Dataflow Graphs, 1995.
[25] William J. Dally, et al. Design tradeoffs for tiled CMP on-chip networks, 2006, ICS '06.
[26] Henry Hoffmann, et al. On-Chip Interconnection Architecture of the Tile Processor, 2007, IEEE Micro.
[27] Hong Song, et al. A Programming Model for an Embedded Media Processing Architecture, 2005, SAMOS.