Tiled Multicore Processors
Henry Hoffmann | Anant Agarwal | Volker Strumpen | David Wentzlaff | Matthew I. Frank | Jason E. Miller | Saman P. Amarasinghe | Walter Lee | Michael Bedford Taylor | Arvind Saraf | Ben Greenwald | Jason Sungtae Kim | James Psota | Nathan Shnidman | Ian Bratt | Paul R. Johnson
[1] Victor Lee,et al. The RAW benchmark suite: computation structures for general purpose computing , 1997, The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.
[2] H. T. Kung,et al. The Warp Computer: Architecture, Implementation, and Performance , 1987, IEEE Transactions on Computers.
[3] Henry Hoffmann,et al. Stream Algorithms and Architecture , 2004, J. Instr. Level Parallelism.
[4] Kunle Olukotun,et al. Niagara: a 32-way multithreaded Sparc processor , 2005, IEEE Micro.
[5] Seth Copen Goldstein,et al. PipeRench: a coprocessor for streaming multimedia acceleration , 1999, ISCA.
[6] Christoforos E. Kozyrakis,et al. A New Direction for Computer Architecture Research , 1998, Computer.
[7] Ken Mai,et al. The future of wires , 2001, Proc. IEEE.
[8] James E. Smith,et al. Complexity-Effective Superscalar Processors , 1997, The 24th Annual International Symposium on Computer Architecture.
[9] Jack Dongarra,et al. LAPACK: a portable linear algebra library for high-performance computers , 1990, SC.
[10] William J. Dally,et al. Smart Memories: a modular reconfigurable architecture , 2000, ISCA '00.
[11] Noah Treuhaft,et al. Scalable Processors in the Billion-Transistor Era: IRAM , 1997, Computer.
[12] Simha Sethumadhavan,et al. Distributed Microarchitectural Protocols in the TRIPS Prototype Processor , 2006, 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).
[13] David Wentzlaff. Architectural implications of bit-level computation in communication applications , 2002 .
[14] Anant Agarwal,et al. Scalar operand networks: on-chip interconnect for ILP in partitioned architectures , 2003, The Ninth International Symposium on High-Performance Computer Architecture (HPCA-9).
[15] R. Nagarajan,et al. A design space evaluation of grid processor architectures , 2001, 34th ACM/IEEE International Symposium on Microarchitecture (MICRO-34).
[16] David Shoemaker,et al. NuMesh: An architecture optimized for scheduled communication , 2004, The Journal of Supercomputing.
[17] Yuefan Deng,et al. New trends in high performance computing , 2001, Parallel Computing.
[18] Henry Hoffmann,et al. A stream compiler for communication-exposed architectures , 2002, ASPLOS X.
[19] Michael Taylor. Deionizer: A Tool for Capturing and Embedding I/O Cells , 2004 .
[20] H. Peter Hofstee,et al. Power efficient processor architecture and the cell processor , 2005, 11th International Symposium on High-Performance Computer Architecture.
[21] M. Bohr. Interconnect scaling-the real limiter to high performance ULSI , 1995, Proceedings of International Electron Devices Meeting.
[22] Vivek Sarkar,et al. Baring It All to Software: Raw Machines , 1997, Computer.
[23] Doug Matzke,et al. Will Physical Scalability Sabotage Performance Gains? , 1997, Computer.
[24] K. Yelick,et al. Generating Permutation Instructions from a High-Level Description , 2004 .
[25] Rajeev Barua,et al. Maps: a compiler-managed memory system for raw machines , 1999, ISCA.
[26] William J. Dally,et al. The Imagine Stream Processor , 2002, Proceedings of the IEEE International Conference on Computer Design: VLSI in Computers and Processors.
[27] Pradip Bose,et al. Guest Editors' Introduction: Power and Complexity Aware Design , 2003, IEEE Micro.
[28] Jason E. Miller,et al. Software instruction caching , 2007 .
[29] Antonio González,et al. Modulo scheduling for a fully-distributed clustered VLIW architecture , 2000, MICRO 33.
[30] David G. Stork. Happy Birthday, HAL! , 1997, Computer.
[31] Michael I. Gordon,et al. Exploiting coarse-grained task, data, and pipeline parallelism in stream programs , 2006, ASPLOS XII.
[32] Stephen P. Crago,et al. A performance analysis of PIM, stream processing, and tiled processing on memory-intensive signal processing kernels , 2003, ISCA '03.
[33] Vivek Sarkar,et al. Space-time scheduling of instruction-level parallelism on a raw machine , 1998, ASPLOS VIII.
[34] Henry Hoffmann,et al. On-Chip Interconnection Architecture of the Tile Processor , 2007, IEEE Micro.
[35] Anant Agarwal,et al. Scalar operand networks , 2005, IEEE Transactions on Parallel and Distributed Systems.
[36] Donald Yeung,et al. SimpleFit: A Framework for Analyzing Design Trade-Offs in Raw Architectures , 2001, IEEE Trans. Parallel Distributed Syst..
[37] John Wawrzynek,et al. Garp: a MIPS processor with a reconfigurable coprocessor , 1997, The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.
[38] A. J. KleinOsowski,et al. MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research , 2002, IEEE Computer Architecture Letters.
[39] John Kubiatowicz,et al. Integrated shared-memory and message-passing communication in the Alewife multiprocessor , 1998 .
[40] Christopher Batten,et al. The Vector-Thread Architecture , 2004, ISCA 2004.
[41] William Thies,et al. StreamIt: A Language for Streaming Applications , 2002, CC.
[42] Vikas Agarwal,et al. Clock rate versus IPC: the end of the road for conventional microarchitectures , 2000, Proceedings of the 27th International Symposium on Computer Architecture.
[43] Samuel D. Naffziger,et al. The implementation of the next-generation 64b Itanium microprocessor , 2002 .
[44] T. Gross,et al. iWarp-anatomy of a parallel computing system , 1999, IEEE Concurrency.
[45] Henry Hoffmann,et al. The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs , 2002, IEEE Micro.
[46] Henry Hoffmann,et al. Evaluation of the Raw microprocessor: an exposed-wire-delay architecture for ILP and streams , 2004, 31st Annual International Symposium on Computer Architecture.
[47] Mark Stephenson,et al. Convergent scheduling , 2002, MICRO 35.