On Two-Dimensional Sparse Matrix Partitioning: Models, Methods, and a Recipe

We consider two-dimensional partitioning of general sparse matrices for the parallel sparse matrix-vector multiply operation. We present three hypergraph-partitioning-based methods, each with distinct advantages. The first treats the nonzeros of the matrix individually and hence produces fine-grain partitions. The other two produce coarser partitions: one imposes a limit on the number of messages sent and received by a single processor, and the other trades that limit for a lower communication volume. We also present a thorough experimental evaluation of the proposed two-dimensional partitioning methods together with hypergraph-based one-dimensional partitioning methods, using an extensive set of public domain matrices. Furthermore, for users of these partitioning methods, we present a partitioning recipe that chooses one of the methods according to characteristics of the matrix.
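To make the fine-grain idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the commonly used fine-grain hypergraph formulation, in which each nonzero is a vertex and each row and each column is a net, and it assumes that communication volume is measured by the connectivity-minus-one metric. The function names (`fine_grain_hypergraph`, `connectivity_minus_one`, `comm_volume`) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def fine_grain_hypergraph(nonzeros):
    """Build row nets and column nets over nonzero vertices.

    nonzeros: list of (row, col) coordinates; vertex k is nonzeros[k].
    Returns two dicts mapping a row (or column) index to its list of vertices.
    """
    row_nets = defaultdict(list)
    col_nets = defaultdict(list)
    for v, (i, j) in enumerate(nonzeros):
        row_nets[i].append(v)  # nonzero v contributes to output entry y[i]
        col_nets[j].append(v)  # nonzero v needs input entry x[j]
    return dict(row_nets), dict(col_nets)


def connectivity_minus_one(nets, part):
    """Sum over nets of (number of distinct parts the net touches, minus 1)."""
    return sum(len({part[v] for v in pins}) - 1 for pins in nets.values())


def comm_volume(nonzeros, part):
    """Assumed volume proxy: row-net cost (fold) plus column-net cost (expand)."""
    row_nets, col_nets = fine_grain_hypergraph(nonzeros)
    return (connectivity_minus_one(row_nets, part)
            + connectivity_minus_one(col_nets, part))


if __name__ == "__main__":
    # A 3x3 pattern with six nonzeros, split between two processors.
    nonzeros = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 0), (2, 2)]
    part = [0, 0, 0, 1, 1, 1]  # part[k] is the processor owning nonzeros[k]
    print(comm_volume(nonzeros, part))
```

In this sketch a row or column whose nonzeros all sit on one processor costs nothing, while each additional processor touching it adds one unit. The proxy further assumes that each vector entry is owned by one of the processors holding a nonzero in the corresponding row or column.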
