DVFS-control techniques for dense linear algebra operations on multi-core processors

This paper analyzes the impact on power consumption of two dynamic voltage and frequency scaling (DVFS) control strategies applied to the execution of dense linear algebra operations on multi-core processors. The two strategies, prototyped as the Slack Reduction Algorithm (SRA) and the Race-to-Idle Algorithm (RIA), adjust the operating frequency of the cores during the execution of a collection of tasks (into which many dense linear algebra algorithms can be decomposed), but they take very different approaches to saving energy. A power-aware simulator, in charge of scheduling the execution of tasks on the processor cores, is employed to evaluate the performance benefits of these power-control policies for two reference algorithms for the LU factorization, a key operation in the solution of linear systems of equations.
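To make the contrast between the two policies concrete, the following is a minimal single-task sketch, not the SRA or RIA implementation evaluated in the paper. Reading the two names literally, it assumes that a race-to-idle policy runs a task at the highest frequency and then idles, while a slack-reduction policy picks the lowest frequency that still finishes within the available time window. The power model (a static term plus a dynamic term cubic in frequency), the class `PowerModel`, the function names, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerModel:
    """Hypothetical per-core power model (all values are assumptions)."""
    dyn_coeff: float  # watts per GHz^3 while executing
    static_w: float   # watts drawn whenever the core is active
    idle_w: float     # watts drawn while the core is idle

    def active_power(self, f_ghz: float) -> float:
        # Assumed model: static power plus dynamic power cubic in frequency.
        return self.static_w + self.dyn_coeff * f_ghz ** 3


def _energy(work: float, window: float, f_ghz: float, model: PowerModel) -> float:
    """Energy in joules over the whole window: busy phase plus idle phase."""
    busy = work / f_ghz  # seconds, with work in giga-cycles and f in GHz
    return busy * model.active_power(f_ghz) + (window - busy) * model.idle_w


def race_to_idle(work, window, freqs, model):
    """Run at the highest frequency, then idle for the rest of the window."""
    f = max(freqs)
    if work / f > window:
        raise ValueError("task does not fit in the window even at max frequency")
    return f, _energy(work, window, f, model)


def slack_reduction(work, window, freqs, model):
    """Run at the lowest frequency that still finishes within the window."""
    feasible = [f for f in freqs if work / f <= window]
    if not feasible:
        raise ValueError("no available frequency meets the window")
    f = min(feasible)
    return f, _energy(work, window, f, model)


if __name__ == "__main__":
    model = PowerModel(dyn_coeff=1.0, static_w=2.0, idle_w=1.0)
    freqs = [1.0, 2.0]  # available frequencies in GHz (assumed)
    print("race-to-idle:   ", race_to_idle(2.0, 2.0, freqs, model))
    print("slack reduction:", slack_reduction(2.0, 2.0, freqs, model))
```

In this toy model, which policy uses less energy depends on the parameters: a large static power combined with a cheap idle state favors finishing quickly, whereas a dominant dynamic term favors stretching the task into its slack. The paper's strategies operate on a whole collection of dependent tasks, which this sketch does not attempt to model.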
