System-Level Data-Flow Transformation Exploration and Power-Area Trade-offs Demonstrated on Video Codecs

Application studies in the domain of image and video processing systems indicate that up to 80% of the power and area cost in customized architectures for such data-dominant processing is due to the storage and transfer of multi-dimensional (M-D) data. This paper makes two main contributions. First, as a crucial step toward reducing this dominant cost, we propose an exploration sub-script focused on data-flow transformations that address the system-level storage organization. This sub-script fits within a complete high-level memory management methodology developed in the context of our ATOMIUM research activity. We also indicate the potential for future design support in each of its stages. Second, we demonstrate the usefulness of the stages in this novel system exploration approach on realistic test vehicles, in particular crucial modules in a complex H.263 video decoder system for teleconferencing.
