Integrated link/CPU voltage scaling for reducing energy consumption of parallel sparse matrix applications

Power consumption is quickly becoming a first-class optimization metric for many high-performance parallel computing platforms. Voltage scaling is one technique used by many prior proposals in this direction, and past research has applied it to different components such as networks, CPUs, and memories. In contrast to most existing voltage scaling efforts, which target a single component (CPU, network, or memory), this paper proposes and experimentally evaluates a voltage/frequency scaling algorithm that considers CPUs and communication links in a mesh network at the same time. More specifically, it scales the voltages/frequencies of both the CPUs in the network and the communication links among them in a coordinated fashion (instead of one after another) such that energy savings are maximized without impacting execution time. Our experiments with several tree-based sparse matrix computations reveal that the proposed integrated voltage scaling approach is very effective in practice and brings 13% and 17% energy savings over the pure CPU and pure communication link voltage scaling schemes, respectively. The results also show that our savings are consistent across different network sizes and different sets of voltage/frequency levels.
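
To illustrate why coordinated scaling can beat scaling one component alone, the sketch below searches over pairs of CPU and link levels for a single compute-then-send step and keeps the lowest-energy pair that still meets a time budget. It is not the algorithm evaluated in the paper. The voltage/frequency levels, the V^2 energy model, the `deadline` parameter (standing in for the slack that keeps execution time unchanged), and the function names are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical (voltage, frequency) levels, lowest to highest.
# These values are illustrative and are not taken from the paper.
CPU_LEVELS = [(1.0, 1.0), (1.2, 1.5), (1.4, 2.0)]
LINK_LEVELS = [(0.8, 0.5), (1.0, 1.0), (1.2, 1.5)]


def time_and_energy(cycles, bits, cpu, link):
    """Time and relative energy of one compute-then-send step.

    Assumed model: time is work divided by frequency, and dynamic
    energy per cycle (or per bit) is proportional to V^2.
    """
    (v_cpu, f_cpu), (v_link, f_link) = cpu, link
    time = cycles / f_cpu + bits / f_link
    energy = cycles * v_cpu ** 2 + bits * v_link ** 2
    return time, energy


def best_setting(cycles, bits, deadline, cpu_levels, link_levels):
    """Return (energy, cpu, link) with the least energy that meets the
    deadline, or None if no pair of levels is fast enough."""
    feasible = []
    for cpu, link in product(cpu_levels, link_levels):
        t, e = time_and_energy(cycles, bits, cpu, link)
        if t <= deadline:
            feasible.append((e, cpu, link))
    return min(feasible) if feasible else None
```

Calling `best_setting(6.0, 3.0, 7.0, CPU_LEVELS, LINK_LEVELS)` searches both components together. Passing `LINK_LEVELS[-1:]` or `CPU_LEVELS[-1:]` instead pins one component at its highest level, which mimics a CPU-only or link-only scheme. Under these assumed numbers the joint search settles on the middle level for both components and uses less energy than either single-component variant, because it can split the available slack between computation and communication.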
