Tradeoff between data-, instruction-, and thread-level parallelism in stream processors
[1] William J. Dally,et al. Smart Memories: a modular reconfigurable architecture , 2000, ISCA '00.
[2] James E. Smith,et al. Complexity-Effective Superscalar Processors , 1997, ISCA.
[3] Iain E. G. Richardson,et al. H.264 and MPEG-4 Video Compression: Video Coding for Next-Generation Multimedia , 2003 .
[4] Pat Hanrahan,et al. Brook for GPUs: stream computing on graphics hardware , 2004, SIGGRAPH 2004.
[5] Jung Ho Ahn,et al. The Design Space of Data-Parallel Memory Systems , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[6] Mattan Erez,et al. Merrimac-high-performance and highly-efficient scientific computing with streams , 2006 .
[7] Shreekant S. Thakkar,et al. Internet Streaming SIMD Extensions , 1999, Computer.
[8] Christoforos E. Kozyrakis,et al. Overcoming the limitations of conventional vector processors , 2003, ISCA '03.
[9] William Thies,et al. StreamIt: A Language for Streaming Applications , 2002, CC.
[10] Hunter Scales,et al. AltiVec Extension to PowerPC Accelerates Media Processing , 2000, IEEE Micro.
[11] Dean M. Tullsen,et al. Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[12] Yale N. Patt,et al. One Billion Transistors, One Uniprocessor, One Chip , 1997, Computer.
[13] Anastasis A. Sofokleous,et al. Review: H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia , 2005, Comput. J..
[14] Fred Weber,et al. AMD 3DNow! technology: architecture and implementations , 1999, IEEE Micro.
[15] William J. Dally,et al. Exploring the VLSI scalability of stream processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture (HPCA-9), 2003.
[16] William J. Dally,et al. Stream register files with indexed access , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[17] Vivek Sarkar,et al. Baring It All to Software: Raw Machines , 1997, Computer.
[18] William J. Dally,et al. Imagine: Media Processing with Streams , 2001, IEEE Micro.
[19] S. Asano,et al. The design and implementation of a first-generation CELL processor , 2005, ISSCC 2005, IEEE International Solid-State Circuits Conference, Digest of Technical Papers.
[20] Jung Ho Ahn,et al. Merrimac: Supercomputing with Streams , 2003, ACM/IEEE SC 2003 Conference (SC'03).
[21] Edward A. Lee,et al. Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing , 1989, IEEE Transactions on Computers.
[22] Noah Treuhaft,et al. Scalable Processors in the Billion-Transistor Era: IRAM , 1997, Computer.
[23] William J. Dally,et al. Register organization for media processing , 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture (HPCA-6).
[24] Jung Ho Ahn,et al. Memory and control organizations of stream processors , 2007 .
[25] William J. Dally,et al. Analysis and Performance Results of a Molecular Modeling Application on Merrimac , 2004, Proceedings of the ACM/IEEE SC2004 Conference.
[26] J. W. Backus,et al. Can programming be liberated from the von Neumann style , 1977 .
[27] Quinn Jacobson,et al. Trace processors , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[28] Jaehyuk Huh,et al. Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture , 2003, IEEE Micro.
[29] Christopher Batten,et al. The vector-thread architecture , 2004, Proceedings 31st Annual International Symposium on Computer Architecture, 2004.
[30] William J. Dally,et al. Programmable Stream Processors , 2003, Computer.
[31] William J. Dally,et al. Conditional techniques for stream processing kernels , 2004 .
[32] John Backus. Can programming be liberated from the von Neumann style , 1978 .
[33] John W. Backus,et al. Can programming be liberated from the von Neumann style?: a functional style and its algebra of programs , 1978, CACM.
[34] Kunle Olukotun,et al. The Stanford Hydra CMP , 2000, IEEE Micro.
[35] William J. Dally,et al. Communication Scheduling , 2000, ASPLOS.
[36] Luiz André Barroso,et al. Piranha: a scalable architecture based on single-chip multiprocessing , 2000, Proceedings of 27th International Symposium on Computer Architecture.