Machine Learning-Aided Numerical Linear Algebra: Convolutional Neural Networks for the Efficient Preconditioner Generation

Generating sparsity patterns for effective block-Jacobi preconditioners is a challenging and computationally expensive task, in particular for problems of unknown origin. In this paper we design a convolutional neural network (CNN) to detect natural block structures in matrix sparsity patterns. For test matrices in which a natural block structure is overlaid with a random distribution of nonzeros (noise), we show that a trained network identifies the strongly connected components with more than 95% prediction accuracy, and that the resulting block-Jacobi preconditioner effectively accelerates an iterative GMRES solver. When the matrix is segmented into diagonal tiles of size 128x128, the sparsity pattern of an effective block-Jacobi preconditioner can be generated for each tile in less than a millisecond on a production-line GPU.
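To illustrate the downstream use of the detected block structure, the following is a minimal SciPy sketch, not the paper's implementation: given a block partition of the diagonal (which in the paper would be supplied by the trained CNN), it assembles a block-Jacobi preconditioner by inverting each diagonal block and passes it to GMRES. The function name `block_jacobi_preconditioner` and the small synthetic test matrix (block-diagonal structure plus off-block "noise") are illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def block_jacobi_preconditioner(A, block_sizes):
    """Build M^{-1} for block-Jacobi from a given diagonal block partition.

    The partition (block_sizes) is assumed to be supplied externally,
    e.g. by the CNN-based pattern detection described in the paper.
    """
    A = A.tocsr()
    inv_blocks, start = [], 0
    for b in block_sizes:
        # Extract the diagonal block and invert it densely (blocks are small).
        blk = A[start:start + b, start:start + b].toarray()
        inv_blocks.append(np.linalg.inv(blk))
        start += b
    Minv = sp.block_diag(inv_blocks, format="csr")
    return spla.LinearOperator(A.shape, matvec=lambda x: Minv @ x)

# Hypothetical test problem: block-diagonal structure plus random noise,
# mimicking the matrices used in the paper's experiments.
rng = np.random.default_rng(0)
n, block_sizes = 12, [4, 4, 4]
A = sp.block_diag(
    [np.eye(b) * 4 + rng.standard_normal((b, b)) * 0.1 for b in block_sizes],
    format="csr",
)
A = A + sp.random(n, n, density=0.05, random_state=0)  # off-block noise
rhs = np.ones(n)

M = block_jacobi_preconditioner(A, block_sizes)
x, info = spla.gmres(A, rhs, M=M)  # info == 0 signals convergence
```

In practice the inversion of the small diagonal blocks would be done in a batched fashion on the GPU, as in the batched Gauss-Jordan and Gauss-Huard kernels cited by the paper; the dense `np.linalg.inv` here is only a stand-in for that step.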
