Architectural Support for the Stream Execution Model on General-Purpose Processors
William J. Dally | Mattan Erez | Mendel Rosenblum | Joel Coburn | Jayanth Gummaraju
[1] Pat Hanrahan, et al. Brook for GPUs: stream computing on graphics hardware, 2004, ACM Trans. Graph.
[2] Daehyun Kim, et al. Architectural support for uniprocessor and multiprocessor active memory systems, 2004, IEEE Transactions on Computers.
[3] Sally A. McKee, et al. A memory controller for improved performance of streamed computations on symmetric multiprocessors, 1996, Proceedings of International Conference on Parallel Processing.
[4] H. Peter Hofstee, et al. Power efficient processor architecture and the cell processor, 2005, 11th International Symposium on High-Performance Computer Architecture.
[5] Aamer Jaleel, et al. DRAMsim: a memory system simulator, 2005, CARN.
[6] P. Hanrahan, et al. Sequoia: Programming the Memory Hierarchy, 2006, ACM/IEEE SC 2006 Conference (SC'06).
[7] Jaehyuk Huh, et al. Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture, 2003, ISCA '03.
[8] Jung Ho Ahn, et al. Merrimac: Supercomputing with Streams, 2003, ACM/IEEE SC 2003 Conference (SC'03).
[9] William J. Dally, et al. Memory access scheduling, 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[10] M. Horowitz, et al. The stream virtual machine, 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004.
[11] William J. Dally, et al. Imagine: Media Processing with Streams, 2001, IEEE Micro.
[12] James Demmel, et al. Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply, 2002, ACM/IEEE SC 2002 Conference (SC'02).
[13] Timothy J. Barth, et al. High-order methods for computational physics, 1999.
[14] Henry Hoffmann, et al. The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs, 2002, IEEE Micro.
[15] William Thies, et al. StreamIt: A Language for Streaming Applications, 2002, CC.
[16] Michael Gschwind, et al. Optimizing Compiler for the CELL Processor, 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[17] William J. Dally, et al. Smart Memories: a modular reconfigurable architecture, 2000, ISCA '00.
[18] Jung Ho Ahn, et al. The Design Space of Data-Parallel Memory Systems, 2006, ACM/IEEE SC 2006 Conference (SC'06).
[19] James E. Smith, et al. Data Cache Prefetching Using a Global History Buffer, 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).
[20] Wilson C. Hsieh, et al. Impulse: Memory system support for scientific applications, 1999, Sci. Program.
[21] Mateo Valero, et al. Adding a vector unit to a superscalar processor, 1999, ICS '99.
[22] Krishnan Mahesh, et al. Large-Eddy Simulation of Reacting Turbulent Flows in Complex Geometries, 2006.
[23] William J. Dally, et al. Exploring the VLSI scalability of stream processors, 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings.
[24] Timothy J. Barth, et al. Simplified Discontinuous Galerkin Methods for Systems of Conservation Laws with Convex Extension, 2000.
[25] Mendel Rosenblum, et al. Stream programming on general-purpose processors, 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[26] William J. Dally, et al. Scatter-add in data parallel architectures, 2005, 11th International Symposium on High-Performance Computer Architecture.
[27] William J. Dally, et al. Programmable Stream Processors, 2003, Computer.
[28] Nathan L. Binkert, et al. Network-Oriented Full-System Simulation using M5, 2003.
[29] Ken Kennedy, et al. Software prefetching, 1991, ASPLOS IV.
[30] M. Itskov, et al. Constitutive model and finite element formulation for large strain elasto-plastic analysis of shells, 1999.
[31] Josep Torrellas, et al. Using a user-level memory thread for correlation prefetching, 2002, ISCA.
[32] Mark D. Hill, et al. Surpassing the TLB performance of superpages with less operating system support, 1994, ASPLOS VI.