Sparse matrix permutations to a block triangular form in a distributed environment

Permuting the sparse circuit matrix into block upper triangular form is the first step of the KLU algorithm. This paper presents a two-step parallel algorithm, running in a distributed environment, that applies unsymmetric row permutations followed by symmetric row and column permutations. First, using the [Duff] maximum transversal algorithm and unsymmetric permutations, the matrix is reordered to obtain a zero-free diagonal. Then, by finding the strongly connected components of the matrix's associated directed graph and applying a symmetric permutation, the sparse matrix is brought into block upper triangular form. Both the algorithm and the architecture are presented.
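The two steps can be illustrated with a minimal sequential sketch. It is not the distributed algorithm of the paper: it assumes the nonzero pattern is given as a list of row indices per column, uses a plain depth-first augmenting-path matching in place of Duff's implementation, and uses a recursive form of Tarjan's algorithm for the strongly connected components. All function names are hypothetical.

```python
from typing import List, Tuple


def maximum_transversal(cols: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Match each column to a distinct row holding a nonzero (DFS augmenting paths).

    cols[j] lists the rows with a nonzero in column j (assumed input layout).
    Returns (row_of_col, col_of_row); -1 marks an unmatched column or row.
    """
    n = len(cols)
    col_of_row = [-1] * n

    def augment(j: int, seen: List[bool]) -> bool:
        for i in cols[j]:
            if not seen[i]:
                seen[i] = True
                # Take row i if it is free, or if its current column can move.
                if col_of_row[i] == -1 or augment(col_of_row[i], seen):
                    col_of_row[i] = j
                    return True
        return False

    for j in range(n):
        augment(j, [False] * n)
    row_of_col = [-1] * n
    for i, j in enumerate(col_of_row):
        if j != -1:
            row_of_col[j] = i
    return row_of_col, col_of_row


def strongly_connected_components(adj: List[List[int]]) -> List[List[int]]:
    """Tarjan's algorithm; components come out in reverse topological order."""
    n = len(adj)
    index, low = [-1] * n, [0] * n
    on_stack, stack, comps = [False] * n, [], []
    counter = [0]

    def visit(v: int) -> None:
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack[v] = True
        for w in adj[v]:
            if index[w] == -1:
                visit(w)
                low[v] = min(low[v], low[w])
            elif on_stack[w]:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == v:
                    break
            comps.append(comp)

    for v in range(n):
        if index[v] == -1:
            visit(v)
    return comps


def block_triangular_permutation(cols: List[List[int]]):
    """Return (row_perm, col_perm, block_sizes) for a block upper triangular form.

    Entry (p, q) of the permuted matrix is A[row_perm[p], col_perm[q]].
    """
    row_of_col, col_of_row = maximum_transversal(cols)
    if -1 in row_of_col:
        raise ValueError("matrix is structurally singular")
    # Step 1 (unsymmetric): row i moves to position col_of_row[i], so the
    # diagonal is zero-free. Edge c -> r for each nonzero (r, c) afterwards.
    adj = [[col_of_row[i] for i in cols[c]] for c in range(len(cols))]
    # Step 2 (symmetric): order rows and columns by strongly connected component.
    comps = strongly_connected_components(adj)
    order = [v for comp in comps for v in comp]
    row_perm = [row_of_col[v] for v in order]
    return row_perm, order, [len(comp) for comp in comps]


# Hypothetical 5x5 pattern: cols[j] = rows with a nonzero in column j.
cols = [[0, 3], [0, 3], [3, 4], [1, 2, 4], [0, 1, 2]]
row_perm, col_perm, block_sizes = block_triangular_permutation(cols)
```

With the edge direction chosen here (column to row), the order in which Tarjan's algorithm emits the components already places every nonzero on or above the diagonal blocks, so no reversal of the component list is needed.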

[1]  Mark Zwolinski,et al.  Parallel sparse matrix solver for direct circuit simulations on FPGAs , 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.

[2]  Bora Uçar,et al.  Design, implementation, and analysis of maximum transversal algorithms , 2011, ACM Trans. Math. Softw..

[3]  Timothy A. Davis,et al.  Unsymmetric-pattern multifrontal methods for parallel sparse LU factorization , 1991 .

[4]  Nicholas I. M. Gould,et al.  A numerical evaluation of sparse direct solvers for the solution of large sparse symmetric linear systems of equations , 2007, TOMS.

[5]  A. DeHon,et al.  Parallelizing sparse Matrix Solve for SPICE circuit simulation using FPGAs , 2009, 2009 International Conference on Field-Programmable Technology.

[6]  To-Yat Cheung.  Graph Traversal Techniques and the Maximum Flow Problem in Distributed Computation , 1983, IEEE Trans. Software Eng..

[7]  Iain S. Duff,et al.  On Algorithms for Obtaining a Maximum Transversal , 1981, TOMS.

[8]  Iain S. Duff,et al.  The Design and Use of Algorithms for Permuting Large Entries to the Diagonal of Sparse Matrices , 1999, SIAM J. Matrix Anal. Appl..

[9]  John K. Reid,et al.  An Implementation of Tarjan's Algorithm for the Block Triangularization of a Matrix , 1978, TOMS.

[10]  Timothy A. Davis,et al.  Direct methods for sparse linear systems , 2006, Fundamentals of algorithms.

[11]  Liu Li,et al.  A Highly Efficient GPU-CPU Hybrid Parallel Implementation of Sparse LU Factorization , 2012 .

[12]  Timothy A. Davis,et al.  Algorithm 907 , 2010 .

[13]  Wayne B. Hayes,et al.  Algorithm 908 , 2010 .

[14]  Nathan Linial,et al.  Locality in Distributed Graph Algorithms , 1992, SIAM J. Comput..

[15]  Richard M. Karp,et al.  A n^{5/2} Algorithm for Maximum Matchings in Bipartite Graphs , 1971, SWAT.

[16]  John R. Gilbert,et al.  High-Performance Graph Algorithms from Parallel Sparse Matrices , 2006, PARA.

[17]  Timothy A. Davis,et al.  The university of Florida sparse matrix collection , 2011, TOMS.

[18]  Ekanathan Palamadai Natarajan,et al.  KLU: A High Performance Sparse Linear Solver for Circuit Simulation Problems , 2005 .

[19]  Anshul Gupta,et al.  Fast and effective algorithms for graph partitioning and sparse-matrix ordering , 1997, IBM J. Res. Dev..

[20]  Timothy A. Davis,et al.  A combined unifrontal/multifrontal method for unsymmetric sparse matrices , 1999, TOMS.