Optimization of memory organization and hierarchy for decreased size and power in video and image processing systems

Video and image processing applications manipulate large amounts of data that must be stored and transferred. Because the initial system specification describing these data manipulations heavily influences the final memory organization and hierarchy, there is a clear need for exploration support. We believe that the emphasis should lie on fast but accurate estimation and on high-level steering of the system transformations involved. In this paper, we present a system exploration environment called ATOMIUM that supports these requirements. To illustrate the effectiveness of our approach, two realistic demonstrators are worked out and their design results are described.
