Strategy for power-efficient design of parallel systems

Application studies in image and video processing indicate that between 50% and 80% of the power cost in these systems is due to data storage and transfers. This is especially true for multiprocessor realizations, because conventional parallelization methods ignore the power cost and focus only on performance, even though power consumption depends heavily on the way a system is parallelized. To reduce this dominant cost, we propose to address the system-level storage organization for the multidimensional signals as a first step in mapping these applications, before the parallelization or partitioning decisions. In particular, this step comes before the hardware/software (HW/SW) partitioning, which is traditionally done too early in the design trajectory. Our methodology is illustrated on a parallel quadtree-structured difference pulse-code modulation video codec.