Constraint-driven frequency scaling in a Coarse Grain Reconfigurable Array

This paper introduces a self-optimizing processor/coprocessor model that uses a feedback control system to achieve power efficiency. Software on the processor receives high-level performance constraints (i.e., real-time limits) as goals from the user and, in turn, controls the clock frequency of the coprocessor and memories so that the constraints are met while power dissipation is minimized. The system is prototyped on a Stratix-V Field Programmable Gate Array device. The self-optimization feature requires less than 0.5% of the overall logic resources and, when the control system is activated, reduces average dynamic power dissipation by 33% in a proof-of-concept test case derived from Fast Fourier Transform processing in an IEEE 802.11n demodulator.
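The feedback idea can be illustrated with a minimal sketch. It is not the controller described in the paper: the function name, the set of available clock frequencies, and the assumption that execution time scales inversely with clock frequency are all hypothetical simplifications introduced here for illustration.

```python
def next_frequency(freqs_mhz, current_mhz, measured_ms, deadline_ms):
    """Return the lowest available clock (MHz) expected to meet the deadline.

    Hypothetical model: execution time is assumed to scale inversely with
    clock frequency, ignoring any fixed-latency components.
    """
    # Estimate the amount of work from the last measured run (ms * MHz).
    work = measured_ms * current_mhz
    # Try candidate clocks from slowest to fastest; slower clocks are
    # preferred because dynamic power grows with clock frequency.
    for f in sorted(freqs_mhz):
        if work / f <= deadline_ms:
            return f
    # No candidate meets the constraint: fall back to the fastest clock.
    return max(freqs_mhz)


if __name__ == "__main__":
    clocks = [50, 100, 150, 200]  # assumed available clock settings, in MHz
    # The task ran in 2 ms at 200 MHz against a 4 ms real-time limit.
    print(next_frequency(clocks, 200, 2.0, 4.0))
```

Under these assumptions, each call takes the latest timing measurement as feedback and returns the clock setting for the next run, so the clock is lowered when there is slack and raised when the constraint is at risk.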
