High-Bandwidth Address Generation Unit
[1] David J. Kuck, et al. The Burroughs Scientific Processor (BSP), 1982, IEEE Transactions on Computers.
[2] Hunter Scales, et al. AltiVec Extension to PowerPC Accelerates Media Processing, 2000, IEEE Micro.
[3] David T. Harper, et al. Block, Multistride Vector, and FFT Accesses in Parallel Memory Systems, 1991, IEEE Trans. Parallel Distributed Syst.
[4] Mitsumasa Koyanagi, et al. A new multiport memory for high performance parallel processor system with shared memory, 1998, Proceedings of 1998 Asia and South Pacific Design Automation Conference.
[5] Eduard Ayguadé, et al. Conflict-Free Access for Streams in Multimodule Memories, 1995, IEEE Trans. Computers.
[6] Richard M. Russell, et al. The CRAY-1 computer system, 1978, CACM.
[7] Stamatis Vassiliadis, et al. Reconfigurable Fixed Point Dense and Sparse Matrix-Vector Multiply/Add Unit, 2006, IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP'06).
[8] Kai Hwang, et al. Computer architecture and parallel processing, 1984, McGraw-Hill Series in computer organization and architecture.
[9] H. Peter Hofstee, et al. Introduction to the Cell multiprocessor, 2005, IBM J. Res. Dev.
[10] Stamatis Vassiliadis, et al. Reconfigurable Multiple Operation Array, 2005, SAMOS.
[11] Mateo Valero, et al. Exploiting instruction- and data-level parallelism, 1997, IEEE Micro.
[12] C. John Glossner, et al. Instruction set extensions for software defined radio on a multithreaded processor, 2005, CASES '05.
[13] David Abramson, et al. Automated synthesis of interleaved memory systems for custom computing machines, 1998, Proceedings. 24th EUROMICRO Conference (Cat. No.98EX204).
[14] David T. Harper, et al. Increased Memory Performance During Vector Accesses Through the use of Linear Address Transformations, 1992, IEEE Trans. Computers.
[15] Gurindar S. Sohi. High-Bandwidth Interleaved Memories for Vector Processors-A Simulation Study, 1993, IEEE Trans. Computers.
[16] Jong Won Park. An Efficient Buffer Memory System for Subarray Access, 2001, IEEE Trans. Parallel Distributed Syst.
[17] Sally A. McKee, et al. Design of a parallel vector access unit for SDRAM memory systems, 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).
[18] Mateo Valero, et al. Vector architectures: past, present and future, 1998, ICS '98.
[19] Paul Budnik, et al. The Organization and Use of Parallel Memories, 1971, IEEE Transactions on Computers.
[20] Duncan H. Lawrie, et al. The Prime Memory System for Array Access, 1982, IEEE Transactions on Computers.
[21] David T. Harper, et al. Conflict-Free Vector Access Using a Dynamic Storage Scheme, 1991, IEEE Trans. Computers.
[22] André Seznec, et al. Interleaved Parallel Schemes, 1994, IEEE Trans. Parallel Distributed Syst.
[23] Stamatis Vassiliadis, et al. The MOLEN polymorphic processor, 2004, IEEE Transactions on Computers.
[24] Jong Won Park. Multiaccess Memory System for Attached SIMD Computer, 2004, IEEE Trans. Computers.
[25] Stamatis Vassiliadis, et al. Multimedia rectangularly addressable memory, 2006, IEEE Transactions on Multimedia.
[26] Sanu Mathew, et al. A 9-GHz 65-nm Intel® Pentium 4 Processor Integer Execution Unit, 2007, IEEE J. Solid State Circuits.
[27] M.H. Sunwoo, et al. Design of address generation unit for audio DSP, 2004, Proceedings of 2004 International Symposium on Intelligent Signal Processing and Communication Systems, 2004. ISPACS 2004.
[28] Sally A. McKee, et al. Algorithmic foundations for a parallel vector access memory system, 2000, SPAA '00.
[29] Steven W. Hammond, et al. Architecture and Application: The Performance of the NEC SX-4 on the NCAR Benchmark Suite, 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[30] Wonyong Sung, et al. An FPGA based SIMD processor with a vector memory unit, 2006, 2006 IEEE International Symposium on Circuits and Systems.
[31] Stamatis Vassiliadis, et al. Implementation and evaluation of the Complex Streamed Instruction set, 2001, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques.
[32] R. Krishnamurthy, et al. A 4 GHz 130 nm address generation unit with 32-bit sparse-tree adder core, 2002, 2002 Symposium on VLSI Circuits. Digest of Technical Papers (Cat. No.02CH37302).
[33] Mateo Valero, et al. Command vector memory systems: high performance at low cost, 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[34] Mateo Valero, et al. Three-dimensional memory vectorization for high bandwidth media memory systems, 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings.
[35] Michael R. Macedonia, et al. The GPU Enters Computing's Mainstream, 2003, Computer.
[36] David H. Bailey, et al. Vector Computer Memory Bank Contention, 1987, IEEE Transactions on Computers.
[37] Shreekant S. Thakkar, et al. Internet Streaming SIMD Extensions, 1999, Computer.
[38] André Seznec, et al. Interleaved parallel schemes: improving memory throughput on supercomputers, 1992, ISCA '92.