Optimizing Parallel Sparse Matrix-Vector Multiplication by Corner Partitioning

Multiplying a vector by a sparse matrix is an important kernel in scientific computing. We study how to optimize the parallel performance of this operation by reducing communication. We review existing approaches and present a new two-dimensional partitioning method for symmetric matrices, called corner partitioning. The method is simple and can be implemented using existing hypergraph partitioning software. Experimental results show that it often produces higher-quality partitions than traditional one-dimensional partitioning methods and is competitive with existing two-dimensional methods. It is also fast to compute. Finally, we propose a graph model for an ordering problem that further optimizes our approach; this model leads to a graph algorithm based on vertex cover or vertex separator.
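
To make the communication cost concrete, the following minimal Python sketch computes y = Ax for a matrix stored as coordinate triples and counts the vector entries that must be sent between parts under a one-dimensional row partitioning. It illustrates only the traditional one-dimensional baseline, not corner partitioning itself. The function names, the triple format, and the assumption that row i, x[i], and y[i] share one owner are illustrative choices, not taken from the paper.

```python
def spmv(entries, x, n):
    """Compute y = A x, where A is n-by-n and given as (row, col, value) triples."""
    y = [0.0] * n
    for i, j, v in entries:
        y[i] += v * x[j]
    return y


def row_partition_comm_volume(entries, part):
    """Count the words communicated for y = A x under a 1D row partitioning.

    Assumption (illustrative): part[i] is the owner of row i, x[i], and y[i].
    The owner of row i needs x[j] for every nonzero a_ij; if x[j] lives on
    another part, it is sent once to each distinct part that needs it.
    """
    needed = set()
    for i, j, _ in entries:
        receiver = part[i]
        if part[j] != receiver:
            needed.add((j, receiver))  # x[j] must be sent to `receiver`
    return len(needed)


def tridiagonal(n):
    """Small symmetric test matrix: 2 on the diagonal, -1 off the diagonal."""
    entries = []
    for i in range(n):
        entries.append((i, i, 2.0))
        if i + 1 < n:
            entries.append((i, i + 1, -1.0))
            entries.append((i + 1, i, -1.0))
    return entries


if __name__ == "__main__":
    A = tridiagonal(4)
    print(spmv(A, [1.0, 1.0, 1.0, 1.0], 4))
    print(row_partition_comm_volume(A, [0, 0, 1, 1]))  # contiguous blocks
    print(row_partition_comm_volume(A, [0, 1, 0, 1]))  # interleaved rows
```

In this small example the contiguous assignment needs fewer words than the interleaved one, which is the kind of difference a partitioner tries to exploit.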