Hierarchical power management for adaptive tightly-coupled processor arrays

We present a self-adaptive hierarchical power management technique for massively parallel processor architectures that supports invasive computing, a new resource-aware parallel computing paradigm. In this paradigm, an application dynamically claims, uses, and releases resources in three phases: resource acquisition (invade), program loading/configuration and execution (infect), and release (retreat). Resource invasion is governed by dedicated decentralized hardware controllers, called invasion controllers (ictrls), one of which is integrated into each processing element (PE). Several invasion strategies for claiming linearly connected or rectangular regions of processing resources are implemented. The key idea is to exploit the decentralized resource management inherent to invasive computing for power savings: applications themselves control the power of processing resources and invasion controllers through a hierarchical power-gating approach. We propose analytical models that estimate the individual components of energy consumption for faster design space exploration, and we compare them with results obtained from a cycle-accurate C++ simulator of the processor array. To find optimal design trade-offs, we compare (a) energy consumption, (b) hardware cost, and (c) timing overheads for different sizes of power domains. Experimental results show significant energy savings (up to 73%) for selected characteristic algorithms and different resource utilizations. In addition, we demonstrate the accuracy of our proposed analytical model, with estimation errors below 3.6%.
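The invade/infect/retreat life cycle and its coupling to power gating can be illustrated with a minimal Python sketch. It is not the authors' hardware design or simulator: the class and method names (`PE`, `LinearArray`, `powered_count`), the array size of 8, and the use of a single power domain per PE are all assumptions made for illustration. The sketch models only a linear invasion strategy in which a PE is powered up when it is claimed and powered down when it is released.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PE:
    """One processing element with a single power domain (hypothetical model)."""
    powered: bool = False          # True while the power domain is switched on
    owner: Optional[str] = None    # application currently holding this PE
    program: Optional[str] = None  # configuration loaded during infect


@dataclass
class LinearArray:
    """A one-dimensional row of PEs; every PE starts powered down."""
    pes: List[PE] = field(default_factory=lambda: [PE() for _ in range(8)])

    def invade(self, app: str, seed: int, want: int) -> List[int]:
        """Claim up to `want` contiguous free PEs starting at `seed`.

        Each PE is powered up at the moment it is claimed. The walk stops
        at the first PE that is already owned or at the array boundary.
        """
        claimed: List[int] = []
        for i in range(seed, len(self.pes)):
            if len(claimed) == want or self.pes[i].owner is not None:
                break
            self.pes[i].owner = app
            self.pes[i].powered = True
            claimed.append(i)
        return claimed

    def infect(self, app: str, claimed: List[int], program: str) -> None:
        """Load a program onto PEs that `app` has previously claimed."""
        for i in claimed:
            assert self.pes[i].owner == app, "PE not owned by this application"
            self.pes[i].program = program

    def retreat(self, app: str, claimed: List[int]) -> None:
        """Release claimed PEs and power them down again."""
        for i in claimed:
            assert self.pes[i].owner == app, "PE not owned by this application"
            self.pes[i] = PE()  # back to unowned, unconfigured, powered down

    def powered_count(self) -> int:
        """Number of PEs whose power domain is currently on."""
        return sum(pe.powered for pe in self.pes)
```

In this sketch, an application that claims three PEs and later retreats leaves the rest of the array untouched, so unclaimed PEs never leave the powered-down state. The hierarchical scheme described above additionally gates the invasion controllers themselves and considers power domains of different sizes, which this single-domain model does not capture.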
