Energy‐aware strategies for task‐parallel sparse linear system solvers

We present several energy‐aware strategies that improve the energy efficiency of a task‐parallel preconditioned Conjugate Gradient (PCG) iterative solver on an Intel Xeon Haswell‐EP processor. These techniques leverage the processor's power‐saving states in two ways: they promote the hardware into a more energy‐efficient C‐state, and they modify the CPU frequency (the processor's P‐state) for some operations of the PCG. We demonstrate that applying these strategies during the main operations of the iterative solver can reduce its energy consumption considerably, especially for memory‐bound computations.
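As a rough illustration of where per‐operation frequency changes could sit in a PCG iteration, the following is a minimal sketch, not the solver used in this work. It swaps in a Jacobi (diagonal) preconditioner and a small dense matrix for simplicity, and it assumes that the matrix‐vector product and the preconditioner application are the memory‐bound operations. The `frequency_hint` helper and the `requests` log are hypothetical: they only record which level would be requested and do not touch any real power‐management interface.

```python
from contextlib import contextmanager

# Hypothetical log of requested frequency levels. A real run would call a
# platform-specific DVFS interface instead; this sketch only records intent.
requests = []


@contextmanager
def frequency_hint(level, op):
    """Record that operation `op` would run at `level` ("low" or "high")."""
    requests.append((op, level))
    yield


def matvec(A, x):
    """Dense matrix-vector product (stand-in for a sparse one)."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def pcg(A, b, tol=1e-10, max_iter=100):
    """Jacobi-preconditioned CG for a small symmetric positive definite A."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                                  # r = b - A*x with x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    z = [d * ri for d, ri in zip(inv_diag, r)]   # z = M^{-1} r
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        # Assumed memory-bound: request a lower frequency for the product.
        with frequency_hint("low", "matvec"):
            Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:
            break
        # Assumed memory-bound: same hint for the preconditioner application.
        with frequency_hint("low", "preconditioner"):
            z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

Calling `pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` solves a small test system, and `requests` then lists the operations for which a lower frequency would have been requested. Wrapping each operation in a context manager keeps the frequency policy separate from the numerical code, so a different policy can be tried without changing the solver.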
