Evaluation of the Raw microprocessor: an exposed-wire-delay architecture for ILP and streams
Henry Hoffmann | Anant Agarwal | Volker Strumpen | David Wentzlaff | Matthew I. Frank | Jason E. Miller | Saman P. Amarasinghe | Walter Lee | Michael Bedford Taylor | Arvind Saraf | Ben Greenwald | James Psota | Nathan Shnidman | Ian Bratt | Paul R. Johnson | Jason Sungtae Kim
[1] H. T. Kung,et al. The Warp Computer: Architecture, Implementation, and Performance , 1987, IEEE Transactions on Computers.
[2] Jack Dongarra,et al. LAPACK: a portable linear algebra library for high-performance computers , 1990, SC.
[3] Anoop Gupta,et al. The Stanford Dash multiprocessor , 1992, Computer.
[4] Michael D. Noakes,et al. The J-machine multicomputer: an architectural evaluation , 1993, ISCA '93.
[5] M. Bohr. Interconnect scaling-the real limiter to high performance ULSI , 1995, Proceedings of International Electron Devices Meeting.
[6] Multiscalar processors , 1995, ISCA 1995.
[7] James E. Smith,et al. Complexity-Effective Superscalar Processors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[8] Victor Lee,et al. The RAW benchmark suite: computation structures for general purpose computing , 1997, Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (Cat. No.97TB100186).
[9] Vivek Sarkar,et al. Baring It All to Software: Raw Machines , 1997, Computer.
[10] John Wawrzynek,et al. Garp: a MIPS processor with a reconfigurable coprocessor , 1997, Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (Cat. No.97TB100186).
[11] Doug Matzke,et al. Will Physical Scalability Sabotage Performance Gains? , 1997, Computer.
[12] David R. O'Hallaron,et al. iWARP: Anatomy of a Parallel Computing System , 1998 .
[13] Christoforos E. Kozyrakis,et al. A New Direction for Computer Architecture Research , 1998, Computer.
[14] Vivek Sarkar,et al. Space-time scheduling of instruction-level parallelism on a raw machine , 1998, ASPLOS VIII.
[15] P. Bai,et al. A high performance 180 nm generation logic technology , 1998, International Electron Devices Meeting 1998. Technical Digest (Cat. No.98CH36217).
[16] John Kubiatowicz,et al. Integrated shared-memory and message-passing communication in the Alewife multiprocessor , 1998 .
[17] Seth Copen Goldstein,et al. PipeRench: a co/processor for streaming multimedia acceleration , 1999, ISCA.
[18] Rajeev Barua,et al. Maps: a compiler-managed memory system for raw machines , 1999, ISCA.
[19] B. Flietner,et al. 'System on a chip' technology platform for 0.18 μm digital, mixed signal and eDRAM applications , 1999, International Electron Devices Meeting 1999. Technical Digest (Cat. No.99CH36318).
[20] Thorsten von Eicken,et al. Technical commentary: IEEE Computer , 1999 .
[21] PipeRench: a coprocessor for streaming multimedia acceleration , 1999, Proceedings of the 26th International Symposium on Computer Architecture (Cat. No.99CB36367).
[22] Vikas Agarwal,et al. Clock rate versus IPC: the end of the road for conventional microarchitectures , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[23] William J. Dally,et al. Smart Memories: a modular reconfigurable architecture , 2000, ISCA '00.
[24] Antonio González,et al. Modulo scheduling for a fully-distributed clustered VLIW architecture , 2000, MICRO 33.
[25] Yuefan Deng,et al. New trends in high performance computing , 2001, Parallel Computing.
[26] R. Nagarajan,et al. A design space evaluation of grid processor architectures , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[27] Donald Yeung,et al. SimpleFit: A Framework for Analyzing Design Trade-Offs in Raw Architectures , 2001, IEEE Trans. Parallel Distributed Syst..
[28] Ken Mai,et al. The future of wires , 2001, Proc. IEEE.
[29] Jack J. Dongarra,et al. Automated empirical optimizations of software and the ATLAS project , 2001, Parallel Comput..
[30] A design space evaluation of grid processor architectures , 2001, MICRO.
[31] Mark Stephenson,et al. Convergent scheduling , 2002, MICRO 35.
[32] James E. Smith,et al. An instruction set and microarchitecture for instruction level distributed processing , 2002, ISCA.
[33] A. J. KleinOsowski,et al. MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research , 2002, IEEE Computer Architecture Letters.
[34] Henry Hoffmann,et al. A stream compiler for communication-exposed architectures , 2002, ASPLOS X.
[35] William Thies,et al. StreamIt: A Language for Streaming Applications , 2002, CC.
[36] David Chinnery,et al. Closing the gap between ASIC & custom , 2002 .
[37] Henry Hoffmann,et al. The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs , 2002, IEEE Micro.
[38] Matthew Mattina,et al. Tarantula: a vector extension to the alpha architecture , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.
[39] David Wentzlaff. Architectural implications of bit-level computation in communication applications , 2002 .
[40] Samuel D. Naffziger,et al. The implementation of the next-generation 64b itanium microprocessor , 2002 .
[41] William J. Dally,et al. The Imagine Stream Processor , 2002, Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors.
[42] Anant Agarwal,et al. Scalar operand networks: on-chip interconnect for ILP in partitioned architectures , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..
[43] David Wentzlaff,et al. Energy characterization of a tiled architecture processor with on-chip networks , 2003, ISLPED '03.
[44] Stephen P. Crago,et al. A performance analysis of PIM, stream processing, and tiled processing on memory-intensive signal processing kernels , 2003, ISCA '03.
[45] Michael Taylor. Deionizer: A Tool for Capturing and Embedding I/O Cells , 2004 .
[46] Henry Hoffmann,et al. Stream Algorithms and Architecture , 2004, J. Instr. Level Parallelism.
[47] K. Yelick,et al. Generating Permutation Instructions from a High-Level Description , 2004 .
[48] Christopher Batten,et al. The vector-thread architecture , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..
[49] Anant Agarwal,et al. Scalar Operand Networks: Design, Implementation, and Analysis , 2004 .
[50] David Shoemaker,et al. NuMesh: An architecture optimized for scheduled communication , 2004, The Journal of Supercomputing.