Configurable parallel memory architecture for multimedia computers

This paper presents a novel parallel memory architecture for multimedia computers. Configurable or programmable addressing circuitry capable of parallel memory accesses can enhance the memory management of multimedia applications. The necessary changes to computer architecture, covering virtual address representation, paging, virtual memory, address computation circuitry, and data permutation, are discussed. These changes allow the memory to be partitioned among different access functions and also allow the same memory area to be accessed with multiple access patterns. As a result, a general-purpose computing system can be built that exploits the repeating memory access patterns of its applications. The performance of the configurable parallel memory architecture (CPMA) is analyzed for a selection of algorithms from a video encoder: motion estimation algorithms and zigzag scanning. Comparisons with traditional sequential memory accesses show that these algorithms benefit from the multiple memory access functions.
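
To illustrate why more than one access function can be useful, the following is a minimal sketch, not the CPMA design itself. It assumes a hypothetical memory with 8 modules and an 8-element-wide array, and compares two textbook module assignment functions: low-order interleaving and linear skewing. The names `interleaved`, `skewed`, and `cycles` are illustrative only and do not come from the paper.

```python
from collections import Counter

N_MODULES = 8  # assumption: 8 parallel memory modules


def interleaved(row, col, width=8):
    """Low-order interleaving: module = linear address mod N_MODULES."""
    return (row * width + col) % N_MODULES


def skewed(row, col, skew=1):
    """Linear skewing: module = (col + skew * row) mod N_MODULES."""
    return (col + skew * row) % N_MODULES


def cycles(pattern, module_of):
    """Memory cycles needed to fetch all elements of one access pattern.

    Elements that map to the same module must be read one after another,
    so the cost is the largest number of elements sharing a module.
    """
    counts = Counter(module_of(r, c) for r, c in pattern)
    return max(counts.values())


# Three access patterns of 8 elements each (coordinates are (row, col)).
row_access = [(0, c) for c in range(8)]
col_access = [(r, 0) for r in range(8)]
diag_access = [(i, i) for i in range(8)]

for name, pattern in [("row", row_access),
                      ("column", col_access),
                      ("diagonal", diag_access)]:
    print(name,
          "interleaved:", cycles(pattern, interleaved),
          "skewed:", cycles(pattern, skewed))
```

In this toy setting neither function is conflict-free for all three patterns: interleaving serializes the column access, while the skewed scheme needs two cycles for the diagonal. Being able to choose the access function per memory area, as the abstract describes, is one way to avoid committing to a single such trade-off.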
