Responsibly Reckless Matrix Algorithms for HPC Scientific Applications

High-performance computing (HPC) achieved an astonishing three orders of magnitude performance improvement per decade for three decades, thanks to hardware technology scaling that delivered an exponential improvement in the rate of floating-point operations, though the pace has slowed in the most recent decade. Captured in the Top500 list, this hardware evolution cascaded through the software stack, triggering changes at all levels, including the redesign of numerical linear algebra libraries. HPC simulations on massively parallel systems are often driven by matrix computations, whose rate of execution depends on their floating-point precision. We highlight the implications of mixed-precision (MP) computations for HPC applications, an approach that Jack Dongarra, the 2021 ACM A.M. Turing Award Laureate, refers to as “responsibly reckless” matrix algorithms. Introduced 75 years ago, long before the advent of HPC architectures, MP numerical methods turn out to be paramount for increasing the throughput of traditional and artificial intelligence (AI) workloads beyond what riding the wave of hardware improvements alone can deliver. Reducing precision trades away some accuracy for performance (reckless behavior), but only in noncritical segments of the workflow (responsible behavior), so that the accuracy requirements of the application can still be satisfied. MP methods offer a valuable performance/accuracy knob and, just as in AI, they are now indispensable in the pursuit of knowledge and discovery in simulations. In particular, we illustrate the impact of MP on three representative HPC applications related to seismic imaging, climate/environment geospatial predictions, and computational astronomy.
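
To make the "responsibly reckless" idea concrete, the following minimal sketch shows classical iterative refinement, one well-known MP technique for linear systems. It is an illustration only, not the implementation used in the applications mentioned above. The function names (to_f32, lu_factor_f32, lu_solve_f32, refine) are hypothetical, and single precision is emulated in pure Python by rounding each intermediate result to binary32. The expensive factorization and triangular solves run in low precision (reckless), while the residual and the solution update are kept in double precision (responsible).

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(A):
    """LU factorization with partial pivoting, emulated in binary32.

    Returns the packed factors (unit-lower L below the diagonal, U on and
    above it) and the row permutation. Hypothetical helper for illustration.
    """
    n = len(A)
    LU = [[to_f32(v) for v in row] for row in A]
    perm = list(range(n))
    for k in range(n):
        # Pick the largest pivot in column k and swap rows.
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] = to_f32(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = to_f32(LU[i][j] - to_f32(LU[i][k] * LU[k][j]))
    return LU, perm


def lu_solve_f32(LU, perm, b):
    """Forward and backward substitution in emulated binary32."""
    n = len(LU)
    y = [to_f32(b[p]) for p in perm]
    for i in range(n):  # solve L y = P b
        for j in range(i):
            y[i] = to_f32(y[i] - to_f32(LU[i][j] * y[j]))
    for i in reversed(range(n)):  # solve U x = y
        for j in range(i + 1, n):
            y[i] = to_f32(y[i] - to_f32(LU[i][j] * y[j]))
        y[i] = to_f32(y[i] / LU[i][i])
    return y


def refine(A, b, iters=5):
    """Iterative refinement: low-precision solves, double-precision residuals."""
    LU, perm = lu_factor_f32(A)      # factor once in low precision
    x = lu_solve_f32(LU, perm, b)    # initial low-precision solution
    for _ in range(iters):
        # Residual r = b - A x in binary64 (the accuracy-critical step).
        r = [bi - sum(a * xj for a, xj in zip(row, x))
             for row, bi in zip(A, b)]
        d = lu_solve_f32(LU, perm, r)          # cheap correction solve
        x = [xi + di for xi, di in zip(x, d)]  # update kept in binary64
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0, 2.0], [1.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
    x_true = [1.0 / 3.0, -2.0 / 7.0, 0.1]
    b = [sum(a * x for a, x in zip(row, x_true)) for row in A]
    LU, perm = lu_factor_f32(A)
    x_lo = lu_solve_f32(LU, perm, b)
    x_mp = refine(A, b)
    print("low-precision error:", max(abs(u - v) for u, v in zip(x_lo, x_true)))
    print("refined error:      ", max(abs(u - v) for u, v in zip(x_mp, x_true)))
```

For a well-conditioned matrix such as the small example in the sketch, a few refinement steps should recover close to double-precision accuracy while the factorization is done entirely in the cheaper format. On ill-conditioned problems the iteration may stall or diverge, which is why the choice of where to lower precision has to be made with care.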
