The Effect of Various Sparsity Structures on Parallelism and Algorithms to Reveal Those Structures

Structured sparse matrices can greatly benefit parallel numerical methods in terms of both parallel performance and convergence. In this chapter, we present combinatorial models for obtaining several sparse matrix forms. We focus on four basic forms: singly-bordered block-diagonal form, doubly-bordered block-diagonal form, nonempty off-diagonal block minimization, and block-diagonal with overlap form. For each form, we first describe the form in detail and the goals sought within it, then examine the combinatorial models that attain the form while targeting those goals, and finally explain how the form benefits certain parallel numerical methods and how those methods relate to the models. Our work focuses especially on graph and hypergraph partitioning models for obtaining these forms. Despite their relatively high preprocessing overhead compared to other heuristics, these models have proven to capture the given problem more accurately, and the overhead can often be amortized because the matrix structure does not change much during a typical numerical simulation.
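As a rough illustration of the first of these forms, the sketch below takes a column partition as given and derives a row ordering that exposes a singly-bordered block-diagonal structure: rows whose nonzeros fall in a single column part join that diagonal block, and rows that span several parts are moved to a border of coupling rows at the bottom. The function name `sb_row_order`, the tiny 5-by-6 pattern, and the hand-written partition are all assumptions made for illustration; in practice the partition would come from a graph or hypergraph partitioner, and the models discussed in the chapter are not reproduced here.

```python
def sb_row_order(nonzeros, col_part, num_rows, num_parts):
    """Classify rows under a given column partition (hypothetical helper).

    nonzeros : iterable of (row, col) index pairs of the sparsity pattern
    col_part : col_part[j] is the part id (0..num_parts-1) of column j
    Returns (row_blocks, border): row_blocks[k] lists the rows of diagonal
    block k, and border lists the coupling rows.
    """
    # Collect, for every row, the set of column parts it touches.
    parts_of_row = [set() for _ in range(num_rows)]
    for i, j in nonzeros:
        parts_of_row[i].add(col_part[j])

    row_blocks = [[] for _ in range(num_parts)]
    border = []
    for i, parts in enumerate(parts_of_row):
        if len(parts) > 1:
            border.append(i)  # row couples two or more blocks
        else:
            # Row touches one part; an empty row is placed in block 0.
            k = next(iter(parts)) if parts else 0
            row_blocks[k].append(i)
    return row_blocks, border


# Assumed 5-by-6 pattern: row 2 has nonzeros in both column parts.
nonzeros = [(0, 0), (0, 1), (1, 3), (1, 4), (2, 2), (2, 3),
            (3, 1), (3, 2), (4, 4), (4, 5)]
col_part = [0, 0, 0, 1, 1, 1]  # hand-written partition, for illustration

row_blocks, border = sb_row_order(nonzeros, col_part, num_rows=5, num_parts=2)
# Block rows first, coupling rows last.
row_order = [i for block in row_blocks for i in block] + border
print(row_blocks, border, row_order)
```

With this ordering, the number of rows in `border` is the quantity a partitioner would try to keep small, while the sizes of the lists in `row_blocks` reflect how evenly the work is spread over the blocks.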
