Power profiling of Cholesky and QR factorizations on distributed memory systems

[1]  Jack J. Dongarra, et al.  A Comprehensive Study of Task Coalescing for Selecting Parallelism Granularity in a Two-Stage Bidiagonal Reduction, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.

[2]  Jack J. Dongarra, et al.  Profiling high performance dense linear algebra algorithms on multicore architectures for power and energy efficiency, 2012, Computer Science - Research and Development.

[3]  Jack J. Dongarra, et al.  Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels, 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).

[4]  Jean-Marc Pierson, et al.  Characterizing Applications from Power Consumption: A Case Study for HPC Benchmarks, 2011, ICT-GLOW.

[5]  Thomas Hérault, et al.  Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA, 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum.

[6]  Thomas Hérault, et al.  DAGuE: A Generic Distributed DAG Engine for High Performance Computing, 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum.

[7]  John Shalf, et al.  The International Exascale Software Project roadmap, 2011, Int. J. High Perform. Comput. Appl.

[8]  A Min Tjoa, et al.  Information and Communication on Technology for the Fight against Global Warming - First International Conference, ICT-GLOW 2011, Toulouse, France, August 30-31, 2011. Proceedings, 2011, ICT-GLOW.

[9]  Al Geist, et al.  PVM (Parallel Virtual Machine), 2011, Encyclopedia of Parallel Computing.

[10]  Dong Li, et al.  PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications, 2010, IEEE Transactions on Parallel and Distributed Systems.

[11]  Wilfred Pinfold, et al.  Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2009, HiPC 2009.

[12]  Emmanuel Agullo, et al.  Comparative study of one-sided factorizations with multiple software packages on multi-core hardware, 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.

[13]  Robert A. van de Geijn, et al.  The libflame Library for Dense Matrix Computations, 2009, Computing in Science & Engineering.

[14]  Julien Langou, et al.  A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures, 2007, Parallel Comput.

[15]  Feng Zhao, et al.  Fine-grained energy profiling for power-aware application design, 2008, PERV.

[16]  Robert A. van de Geijn, et al.  Scheduling of QR Factorization Algorithms on SMP and Multi-Core Architectures, 2008, 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2008).

[17]  David K. Lowenthal, et al.  Just In Time Dynamic Voltage Scaling: Exploiting Inter-Node Slack to Save Energy in MPI Programs, 2005, ACM/IEEE SC 2005 Conference (SC'05).

[18]  Emmanuel Jeannot, et al.  Compact DAG representation and its symbolic scheduling, 1999, J. Parallel Distributed Comput.

[19]  Emmanuel Jeannot, et al.  Compact DAG Representation and Its Dynamic Scheduling, 1999, J. Parallel Distributed Comput.

[20]  Jack Dongarra, et al.  LAPACK Users' Guide, 3rd ed., 1999.

[21]  L. Trefethen, et al.  Numerical linear algebra, 1997.

[22]  Alex Rapaport, et al.  MPI-2: extensions to the message-passing interface, 1997.

[23]  Gene H. Golub, et al.  Matrix computations (3rd ed.), 1996.

[24]  James Demmel, et al.  ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance, 1995, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.

[25]  Jack Dongarra, et al.  PVM: Parallel virtual machine: a users' guide and tutorial for networked parallel computing, 1995.

[26]  G. Golub.  Matrix computations, 1983.