A methodology to evaluate memory architecture design tradeoffs for video signal processors

We develop a methodology for designing the memory and the memory-processor communication network in video signal processors. The memory subsystem is the bottleneck of most video computing systems, and its design requires evaluating tradeoffs among area, cycle time, and utilization. We emphasize the need to consider technological and circuit-level issues when designing a system architecture, particularly for video signal processing (VSP) systems, and present a systematic method by which the organization of the memory architecture can be analyzed and its cycle time approximated before a detailed design is undertaken. We show how variations in sizes and circuit configurations help determine the variations in delay of both the memory and the network, and how the resulting delay curves can be used to design, compare, and choose among different memory-system architectures. We also describe a technique for identifying the on-chip/off-chip boundary in a hierarchical memory-system design for a memory-intensive VSP module. All of our results are validated via layout and simulation of prototype circuits in two different process technologies. Because motion estimation and the discrete cosine transform (DCT) are two of the most important tasks in video processing, we use the design of a motion estimator and of a DCT unit as examples to illustrate the high-level issues in designing the memory architecture for a VSP module. The analysis presented for these two units can also be applied to other processing blocks in the system.
