Geostatistical Modeling and Prediction Using Mixed Precision Tile Cholesky Factorization

Geostatistics represents one of the most challenging classes of scientific applications due to the desire to incorporate an ever increasing number of geospatial locations to accurately model and predict environmental phenomena. For example, the evaluation of the Gaussian log-likelihood function, which constitutes the main computational phase, involves solving systems of linear equations with a large dense symmetric and positive definite covariance matrix. Cholesky, the standard algorithm, requires O(n^3) floating point operators and has an O(n^2) memory footprint, where n is the number of geographical locations. Here, we present a mixed-precision tile algorithm to accelerate the Cholesky factorization during the log-likelihood function evaluation. Under an appropriate ordering, it operates with double-precision arithmetic on tiles around the diagonal, while reducing to single-precision arithmetic for tiles sufficiently far off. This translates into an improvement of the performance without any deterioration of the numerical accuracy of the application. We rely on the StarPU dynamic runtime system to schedule the tasks and to overlap them with data movement. To assess the performance and the accuracy of the proposed mixed-precision algorithm, we use synthetic and real datasets on various shared and distributed-memory systems possibly equipped with hardware accelerators. We compare our mixed-precision Cholesky factorization against the double-precision reference implementation as well as an independent block approximation method. We obtain an average of 1.6X performance speedup on massively parallel architectures while maintaining the accuracy necessary for modeling and prediction.

[1]  David E. Keyes,et al.  Parallel Approximation of the Maximum Likelihood Estimation for the Prediction of Large-Scale Geostatistics Simulations , 2018, 2018 IEEE International Conference on Cluster Computing (CLUSTER).

[2]  David Kainer,et al.  Attacking the Opioid Epidemic: Determining the Epistatic and Pleiotropic Genetic Architectures for Chronic Pain and Opioid Addiction , 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.

[3]  Jordan G. Powers,et al.  A Description of the Advanced Research WRF Version 2 , 2005 .

[4]  Chao Yang,et al.  Accelerating solvers for global atmospheric equations through mixed-precision data flow engine , 2013, 2013 23rd International Conference on Field programmable Logic and Applications.

[5]  Ivan V. Oseledets,et al.  Preconditioners for hierarchical matrices based on their extended sparse form , 2014, Russian Journal of Numerical Analysis and Mathematical Modelling.

[6]  Alexander Litvinenko,et al.  Likelihood approximation with hierarchical matrices for large spatial datasets , 2017, Comput. Stat. Data Anal..

[7]  David E. Keyes,et al.  Exploiting Data Sparsity for Large-Scale Matrix Computations , 2018, Euro-Par.

[8]  Jack Dongarra,et al.  Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects , 2009 .

[9]  A. Gelfand,et al.  Gaussian predictive process models for large spatial data sets , 2008, Journal of the Royal Statistical Society. Series B, Statistical methodology.

[10]  Peter D. Düben,et al.  On the use of scale‐dependent precision in Earth System modelling , 2017 .

[11]  Michael L. Stein,et al.  Limitations on low rank approximations for covariance matrices of spatial data , 2014 .

[12]  Cédric Augonnet,et al.  StarPU: a unified platform for task scheduling on heterogeneous multicore architectures , 2011, Concurr. Comput. Pract. Exp..

[13]  Ying Sun,et al.  Geostatistics for Large Datasets , 2012 .

[14]  Peter D. Düben,et al.  The use of imprecise processing to improve accuracy in weather & climate prediction , 2014, J. Comput. Phys..

[15]  Bruno Sinopoli,et al.  Kalman filtering with intermittent observations , 2004, IEEE Transactions on Automatic Control.

[16]  Eric Darve,et al.  Sparse Supernodal Solver Using Block Low-Rank Compression , 2017, 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).

[17]  Chong Gu,et al.  Smoothing spline Gaussian regression: more scalable computation via efficient approximation , 2004 .

[18]  Nicholas J. Higham,et al.  Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers , 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.

[19]  Wolfgang Hackbusch,et al.  A Sparse Matrix Arithmetic Based on H-Matrices. Part I: Introduction to H-Matrices , 1999, Computing.

[20]  David E. Keyes,et al.  ExaGeoStat: A High Performance Unified Software for Geostatistics on Manycore Systems , 2017, IEEE Transactions on Parallel and Distributed Systems.

[21]  Julien Langou,et al.  Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems , 2007, Int. J. High Perform. Comput. Appl..

[22]  Douglas W. Nychka,et al.  Covariance Tapering for Likelihood-Based Estimation in Large Spatial Data Sets , 2008 .

[23]  Michael L. Stein,et al.  Statistical Properties of Covariance Tapers , 2013 .

[24]  Ronald P. Barry,et al.  Flexible Spatial Models for Kriging and Cokriging Using Moving Averages and the Fast Fourier Transform (FFT) , 2004 .

[25]  David E. Keyes,et al.  Real-Time Massively Distributed Multi-object Adaptive Optics Simulations for the European Extremely Large Telescope , 2018, 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS).

[26]  Prabhat,et al.  Exascale Deep Learning for Climate Analytics , 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.

[27]  Huang Huang,et al.  Hierarchical Low Rank Approximation of Likelihoods for Large Spatial Datasets , 2016, 1605.08898.

[28]  Jack J. Dongarra,et al.  Solving the Generalized Symmetric Eigenvalue Problem using Tile Algorithms on Multicore Architectures , 2011, PARCO.

[29]  Eric Darve,et al.  A fast block low-rank dense solver with applications to finite-element matrices , 2014, J. Comput. Phys..

[30]  N. Cressie,et al.  Fixed rank kriging for very large spatial data sets , 2008 .

[31]  Sven Christophersen,et al.  Approximation of integral operators by Green quadrature and nested cross approximation , 2014, Numerische Mathematik.

[32]  Samuel Williams,et al.  An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling , 2015, SIAM J. Sci. Comput..