A 2D Addressing Mode for Multimedia Applications

This paper discusses architectural solutions that address high data throughput and high computational power, two crucial performance requirements of the MPEG standards. To increase data throughput, we define a new data storage facility with a specific data organization and a new addressing mode. More specifically, we introduce an addressing function that we refer to as two-dimensional block addressing. We propose this addressing approach as an architectural feature, and we believe it has useful properties that may position it as a basic addressing mode in future multimedia architectures. In addition, we propose an instruction set extension that exploits this addressing mode as a means of improving the computational power of a general-purpose superscalar processor. To illustrate this concept, we have implemented a new instruction, "ACcepted Quality", as a dedicated systolic structure. This instruction supports the corresponding function "ACQ" as defined in the Verification Model of MPEG-4. Its FPGA realization suggests an operating latency of 62 ns. Using this result, we evaluated performance with benchmark software (an MPEG-4 shape encoder) on a cycle-accurate simulator. The simulation results indicate a performance improvement of up to 10%. The introduced approach can be used by data encoding tools that are based on block division of data. Such tools are an essential part of many recent and upcoming visual data compression standards, such as MPEG-4.
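The abstract does not spell out the addressing function itself. As a rough illustration only, the sketch below shows what addressing a two-dimensional block in a linearly addressed memory can look like, assuming a frame stored in row-major order. The names `block_addresses` and `read_block` and the parameters `base`, `pitch`, `row`, `col`, `block_h`, and `block_w` are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: assumes a row-major frame layout, which the
# abstract does not specify. All names here are hypothetical.

def block_addresses(base, pitch, row, col, block_h, block_w):
    """Return the linear addresses of a block_h x block_w block.

    base    -- linear address of the frame's top-left element
    pitch   -- number of elements per frame row
    row,col -- position of the block's top-left element in the frame
    """
    return [
        [base + (row + i) * pitch + (col + j) for j in range(block_w)]
        for i in range(block_h)
    ]


def read_block(memory, base, pitch, row, col, block_h, block_w):
    """Gather a 2D block from a flat memory using block_addresses."""
    addresses = block_addresses(base, pitch, row, col, block_h, block_w)
    return [[memory[a] for a in line] for line in addresses]


# Usage: an 8-wide, 4-high frame whose element values equal their addresses.
frame_w, frame_h = 8, 4
memory = list(range(frame_w * frame_h))
block = read_block(memory, base=0, pitch=frame_w, row=1, col=2,
                   block_h=2, block_w=3)
print(block)  # [[10, 11, 12], [18, 19, 20]]
```

In software, each block row needs its own address computation and a separate run of memory accesses. A block addressing mode, as motivated in the abstract, would let an instruction name the whole block through its position and size.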
