A Specification Refinement Methodology for Power Efficient Partitioning of Data-Dominated Algorithms Within Performance Constraints

A specification refinement methodology for the power-efficient partitioning of real-time data-dominated algorithms is presented. The main idea is to reorganize the initial description of the target algorithm with respect to data transfer and storage before conventional partitioning takes place. This is achieved by applying high-level code transformations that optimize data transfer and storage to the initial description. These transformations align data production and consumption between the different procedures of the initial specification, thereby reducing the memory size requirements of the system's realizations, especially in the interfaces between different processors. In this way, the power consumed by data transfer and storage, which forms an important part of the total power budget of a data-dominated system, is significantly reduced. Performance constraints are explicitly taken into account while the transformations are applied. The proposed methodology can be applied both in a parallel (programmable) processor context and in heterogeneous hardware-software architectures. It can also be used for the power-efficient implementation of data-dominated algorithms on architectures based on programmable cores and application-specific memory hierarchies. Experimental results from real-life applications demonstrate the impact of the proposed methodology.
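
The effect of aligning data production and consumption can be illustrated with a small sketch. It is not taken from the paper: the function names `unfused` and `fused`, the doubling "producer" and the neighbour-sum "consumer" are hypothetical stand-ins, and the buffer count is a simplified proxy for memory size rather than a power model. It only shows how merging a producer loop with its consumer can shrink the intermediate storage between two procedures.

```python
def unfused(samples):
    """Producer fills the whole intermediate array before the consumer starts."""
    # Hypothetical producer: doubles every input sample.
    intermediate = [2 * x for x in samples]
    # Hypothetical consumer: sums each value with its predecessor.
    out = [intermediate[i] + intermediate[i - 1]
           for i in range(1, len(intermediate))]
    # The intermediate buffer needs one word per input sample.
    return out, len(intermediate)


def fused(samples):
    """Producer and consumer are aligned: each value is consumed as soon as it exists."""
    out = []
    prev = None  # single word carried across iterations instead of a full array
    for x in samples:
        cur = 2 * x              # produce one value
        if prev is not None:
            out.append(cur + prev)  # consume it immediately
        prev = cur
    # Only one word has to survive from one iteration to the next.
    return out, min(1, len(samples))


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(unfused(data))
    print(fused(data))
```

Both versions compute the same result; under the assumptions above, the intermediate storage drops from one word per sample to a single word, which is the kind of reduction the transformations aim for at procedure and processor interfaces.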
