Data distribution schemes of sparse arrays on distributed memory multicomputers

In general, a data distribution scheme for sparse arrays on a distributed memory multicomputer is composed of three phases: data partition, data distribution, and data compression. Methods proposed in the literature perform the data partition phase first, then the data distribution phase, and finally the data compression phase. We call this the send followed compress (SFC) scheme. In this paper, we propose two other data distribution schemes for sparse arrays: compress followed send (CFS) and encoding-decoding (ED). In the CFS scheme, the data compression phase is performed before the data distribution phase. In the ED scheme, the data compression phase is divided into two steps, encoding and decoding, which are performed before and after the data distribution phase, respectively. To evaluate the CFS and ED schemes, we compare them with the SFC scheme. In the theoretical analysis, we analyze the SFC, CFS, and ED schemes in terms of data distribution time and data compression time. In the experimental test, we implemented these schemes on an IBM SP2 parallel machine. The experimental results show that the CFS and ED schemes outperform the SFC scheme for most of the test cases, and that the ED scheme outperforms the CFS scheme for all test cases.
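The ordering of the phases in the three schemes can be illustrated with a small single-process sketch. Only the phase order is taken from the abstract; everything else is an assumption made for illustration: a row partition, a compressed-row format with arrays `ro`, `co`, `vl`, a hypothetical ED buffer layout (per-row nonzero count followed by column/value pairs), and the number of elements "sent" as a rough stand-in for communication cost. No real message passing is performed, and the helper names are hypothetical.

```python
from typing import List, Tuple

Dense = List[List[float]]
# Assumed compressed-row format: (row offsets, column indices, values).
CRS = Tuple[List[int], List[int], List[float]]


def partition_rows(a: Dense, p: int) -> List[Dense]:
    """Data partition: split the rows into p contiguous blocks."""
    size = (len(a) + p - 1) // p  # ceiling division
    return [a[i * size:(i + 1) * size] for i in range(p)]


def compress_crs(block: Dense) -> CRS:
    """Data compression: keep only the nonzero elements of a dense block."""
    ro, co, vl = [0], [], []
    for row in block:
        for j, v in enumerate(row):
            if v != 0:
                co.append(j)
                vl.append(v)
        ro.append(len(co))
    return ro, co, vl


def encode(block: Dense) -> list:
    """ED encoding (hypothetical layout): per row, count then (col, value) pairs."""
    buf = []
    for row in block:
        nz = [(j, v) for j, v in enumerate(row) if v != 0]
        buf.append(len(nz))
        for j, v in nz:
            buf.extend((j, v))
    return buf


def decode(buf: list) -> CRS:
    """ED decoding: rebuild the compressed-row arrays from the flat buffer."""
    ro, co, vl = [0], [], []
    i = 0
    while i < len(buf):
        count = buf[i]
        i += 1
        for _ in range(count):
            co.append(buf[i])
            vl.append(buf[i + 1])
            i += 2
        ro.append(len(co))
    return ro, co, vl


def sfc(a: Dense, p: int):
    """Partition, send dense blocks, then compress on each receiver."""
    blocks = partition_rows(a, p)
    sent = sum(len(row) for b in blocks for row in b)
    return [compress_crs(b) for b in blocks], sent


def cfs(a: Dense, p: int):
    """Partition, compress on the sender, then send the compressed arrays."""
    packed = [compress_crs(b) for b in partition_rows(a, p)]
    sent = sum(len(ro) + len(co) + len(vl) for ro, co, vl in packed)
    return packed, sent


def ed(a: Dense, p: int):
    """Partition, encode on the sender, send the buffer, decode on the receiver."""
    buffers = [encode(b) for b in partition_rows(a, p)]
    sent = sum(len(b) for b in buffers)
    return [decode(b) for b in buffers], sent


if __name__ == "__main__":
    a = [[0.0, 0.0, 5.0, 0.0],
         [1.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0],
         [0.0, 2.0, 0.0, 3.0]]
    for scheme in (sfc, cfs, ed):
        local, sent = scheme(a, 2)
        print(scheme.__name__, "elements sent:", sent, "local arrays:", local)
```

In this toy setting all three functions produce the same local compressed arrays and differ only in what crosses the simulated network. The element counts say nothing about the timings measured in the paper, which also depend on packing, unpacking, and compression costs on the sender and receivers.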
