High-bandwidth Address Generation Unit
[1] Mateo Valero, et al. Vector architectures: past, present and future, 1998, ICS '98.
[2] Mateo Valero, et al. Command vector memory systems: high performance at low cost, 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[3] M.H. Sunwoo, et al. Design of address generation unit for audio DSP, 2004, Proceedings of 2004 International Symposium on Intelligent Signal Processing and Communication Systems, 2004. ISPACS 2004.
[4] Michael R. Macedonia, et al. The GPU Enters Computing's Mainstream, 2003, Computer.
[5] Mateo Valero, et al. Exploiting instruction- and data-level parallelism, 1997, IEEE Micro.
[6] Mitsumasa Koyanagi, et al. A new multiport memory for high performance parallel processor system with shared memory, 1998, Proceedings of 1998 Asia and South Pacific Design Automation Conference.
[7] David Abramson, et al. Automated synthesis of interleaved memory systems for custom computing machines, 1998, Proceedings. 24th EUROMICRO Conference (Cat. No.98EX204).
[8] Sally A. McKee, et al. Design of a parallel vector access unit for SDRAM memory systems, 2000, Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550).
[9] C. John Glossner, et al. Instruction set extensions for software defined radio on a multithreaded processor, 2005, CASES '05.
[10] Ram Krishnamurthy, et al. A 4 GHz 130 nm address generation unit with 32-bit sparse-tree adder core, 2002, VLSIC 2002.
[11] André Seznec, et al. Interleaved Parallel Schemes, 1994, IEEE Trans. Parallel Distributed Syst.
[12] André Seznec, et al. Interleaved parallel schemes: improving memory throughput on supercomputers, 1992, ISCA '92.
[13] Mateo Valero, et al. Three-dimensional memory vectorization for high bandwidth media memory systems, 2002, MICRO.
[14] Jong Won Park. Multiaccess Memory System for Attached SIMD Computer, 2004, IEEE Trans. Computers.
[15] Stamatis Vassiliadis, et al. Multimedia rectangularly addressable memory, 2006, IEEE Transactions on Multimedia.
[16] Richard M. Russell, et al. The CRAY-1 computer system, 1978, CACM.
[17] Stamatis Vassiliadis, et al. Reconfigurable Fixed Point Dense and Sparse Matrix-Vector Multiply/Add Unit, 2006, IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP'06).
[18] Paul Budnik, et al. The Organization and Use of Parallel Memories, 1971, IEEE Transactions on Computers.
[19] Duncan H. Lawrie, et al. The Prime Memory System for Array Access, 1982, IEEE Transactions on Computers.
[20] Jong Won Park. An Efficient Buffer Memory System for Subarray Access, 2001, IEEE Trans. Parallel Distributed Syst.
[21] Stamatis Vassiliadis, et al. The MOLEN polymorphic processor, 2004, IEEE Transactions on Computers.
[22] R.K. Krishnamurthy, et al. A 9-GHz 65-nm Intel® Pentium 4 Processor Integer Execution Unit, 2006, IEEE Journal of Solid-State Circuits.
[23] David T. Harper, et al. Conflict-Free Vector Access Using a Dynamic Storage Scheme, 1991, IEEE Trans. Computers.
[24] Sally A. McKee, et al. Algorithmic foundations for a parallel vector access memory system, 2000, SPAA '00.
[25] Steven W. Hammond, et al. Architecture and Application: The Performance of the NEC SX-4 on the NCAR Benchmark Suite, 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[26] David J. Kuck, et al. The Burroughs Scientific Processor (BSP), 1982, IEEE Transactions on Computers.
[27] Hunter Scales, et al. AltiVec Extension to PowerPC Accelerates Media Processing, 2000, IEEE Micro.
[28] H. Peter Hofstee, et al. Introduction to the Cell multiprocessor, 2005, IBM J. Res. Dev.
[29] David T. Harper, et al. Increased Memory Performance During Vector Accesses Through the use of Linear Address Transformations, 1992, IEEE Trans. Computers.
[30] Gurindar S. Sohi. High-Bandwidth Interleaved Memories for Vector Processors-A Simulation Study, 1993, IEEE Trans. Computers.
[31] Wonyong Sung, et al. An FPGA based SIMD processor with a vector memory unit, 2006, 2006 IEEE International Symposium on Circuits and Systems.
[32] David H. Bailey, et al. Vector Computer Memory Bank Contention, 1987, IEEE Transactions on Computers.
[33] Shreekant S. Thakkar, et al. Internet Streaming SIMD Extensions, 1999, Computer.
[34] David T. Harper, et al. Block, Multistride Vector, and FFT Accesses in Parallel Memory Systems, 1991, IEEE Trans. Parallel Distributed Syst.
[35] Stamatis Vassiliadis, et al. Implementation and evaluation of the Complex Streamed Instruction set, 2001, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques.
[36] R. Krishnamurthy, et al. A 4 GHz 130 nm address generation unit with 32-bit sparse-tree adder core, 2002, 2002 Symposium on VLSI Circuits. Digest of Technical Papers (Cat. No.02CH37302).
[37] Kai Hwang, et al. Computer architecture and parallel processing, 1984, McGraw-Hill Series in computer organization and architecture.
[38] Stamatis Vassiliadis, et al. Reconfigurable Multiple Operation Array, 2005, SAMOS.
[39] Eduard Ayguadé, et al. Conflict-Free Access for Streams in Multimodule Memories, 1995, IEEE Trans. Computers.