Data-Driven Performance Modeling of Linear Solvers for Sparse Matrices

As code and architectural complexity grow, the performance of scientific codes depends increasingly on the input problem, its data representation, and the underlying hardware. This makes identifying the fastest algorithm for solving a given problem more challenging. In this paper, we focus on modeling the performance of numerical libraries used to solve sparse linear systems. We use machine learning to build data-driven performance models of linear solver implementations, which a novice user can consult to identify the fastest preconditioner and solver for a given input matrix. As model inputs, we use features that describe the matrix structure, the numerical properties of the matrix, and the underlying mesh or input problem. We model the performance of nine linear solvers and thirteen preconditioners available in Trilinos, using 1240 sparse matrices obtained from two different sources. Our prediction models perform significantly better than a blind classifier and than black-box SVM and k-NN classifiers.
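The approach described above can be sketched as a standard supervised-learning pipeline: extract cheap structural features from each sparse matrix, then train a classifier that maps those features to the fastest solver. The sketch below is illustrative only; the feature set, the toy labeling rule, and the random-forest model are assumptions standing in for the paper's actual features and measured Trilinos timings.

```python
# Hypothetical sketch of feature-based solver selection. The features,
# labels, and model are illustrative, not the authors' exact pipeline.
import numpy as np
from scipy import sparse
from sklearn.ensemble import RandomForestClassifier

def matrix_features(A):
    """Cheap structural features of a sparse matrix (illustrative subset)."""
    A = A.tocsr()
    n = A.shape[0]
    nnz = A.nnz
    rows, cols = A.nonzero()
    bandwidth = int(np.max(np.abs(rows - cols))) if nnz else 0
    # Pattern symmetry: fraction of nonzeros whose mirror entry is nonzero.
    P = A.copy()
    P.data = np.ones_like(P.data)
    sym = P.multiply(P.T).nnz / nnz if nnz else 1.0
    return [n, nnz, nnz / (n * n), bandwidth, sym]

# Synthetic training set: random sparse matrices labeled with a made-up
# "fastest solver" rule, standing in for measured solver timings.
rng = np.random.default_rng(0)
X, y = [], []
for i in range(60):
    n = int(rng.integers(20, 60))
    A = sparse.random(n, n, density=float(rng.uniform(0.05, 0.2)),
                      random_state=i)
    if rng.random() < 0.5:
        A = A + A.T  # make the sparsity pattern symmetric
    feats = matrix_features(A)
    X.append(feats)
    y.append("CG" if feats[4] > 0.99 else "GMRES")  # toy label rule

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.predict([matrix_features(sparse.identity(30, format="csr"))])[0])
```

In the full system, the labels would come from timing each solver/preconditioner pair on each training matrix, and the feature vector would also include numerical and mesh-derived properties.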
