A Compression Method for Storage Formats of a Sparse Matrix in Solving the Large-Scale Linear Systems

The finite element method (FEM) is used in a variety of numerical simulations to solve partial differential equations (PDEs), and one of its computational phases requires solving a large linear system of equations. The cost of solving this system with a linear solver often dominates the other computational phases of an FEM analysis. For this reason, graphics processing units (GPUs) are widely adopted in iterative linear solvers such as the generalized minimal residual method (GMRES) to speed up the analysis. Nevertheless, iterative linear solvers on GPUs have two major drawbacks. First, the system matrix may no longer fit in GPU memory as the scale of the simulation increases. In such cases, the GPU can suffer an abnormal slowdown due to data transfer between main memory and GPU memory. Second, the sparse matrix-vector product (SpMV) in the solvers is a hotspot because it requires many indirect memory accesses for the non-zero elements of the sparse matrix. To address these problems, a compression method for the sparse matrix storage format is needed that reduces both the required memory space and the number of memory accesses. In this paper, we propose an improved compression method for conventional sparse matrix storage formats such as compressed sparse row (CSR) and ELLPACK (ELL). We assume that part of the column indexes in those formats is consecutive, so that such a part can be described by its minimum and maximum values. Such consecutive indexes are compressed into two integers. Using this idea, the partial sum of the product for that part can be calculated without loading the consecutive indexes from memory. As a result, the compression method reduces both memory usage and memory access. In our experiments, the compression method reduced the memory usage for 8 out of 10 matrices in CSR and ELL. In particular, the memory reduction ratio for the pwtk matrix reaches up to 26.6% in CSR. 
Furthermore, our compression method reduced the execution time of SpMV compared with CSR on various matrices.
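
The idea of describing consecutive column indexes by their minimum and maximum can be illustrated with a minimal sketch. It is not the storage layout proposed in the paper: the function names `compress_row_runs` and `spmv_runs`, the `run_ptr`/`runs` arrays, and the small example matrix are all assumptions made for illustration, and the sketch runs sequentially on the CPU rather than on a GPU.

```python
def compress_row_runs(indptr, indices):
    """Replace each run of consecutive CSR column indexes with a (min, max) pair.

    Hypothetical helper: run_ptr[i]:run_ptr[i + 1] selects the runs of row i.
    """
    run_ptr = [0]
    runs = []
    for row in range(len(indptr) - 1):
        k, end = indptr[row], indptr[row + 1]
        while k < end:
            first = last = indices[k]
            k += 1
            # Extend the run while the next column index is last + 1.
            while k < end and indices[k] == last + 1:
                last = indices[k]
                k += 1
            runs.append((first, last))
        run_ptr.append(len(runs))
    return run_ptr, runs


def spmv_runs(run_ptr, runs, data, x):
    """Compute y = A @ x from the run-compressed indexes.

    The values in `data` stay in their original CSR order, so a single
    counter walks through them while the columns come from the runs.
    """
    y = [0.0] * (len(run_ptr) - 1)
    k = 0  # position in data
    for row in range(len(y)):
        acc = 0.0
        for first, last in runs[run_ptr[row]:run_ptr[row + 1]]:
            # Columns first..last are implied; no index array is read here.
            for col in range(first, last + 1):
                acc += data[k] * x[col]
                k += 1
        y[row] = acc
    return y


# Assumed 3 x 5 example matrix in CSR form.
indptr = [0, 3, 5, 7]
indices = [0, 1, 2, 1, 4, 2, 3]
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0]

run_ptr, runs = compress_row_runs(indptr, indices)
print(runs)                              # [(0, 2), (1, 1), (4, 4), (2, 3)]
print(spmv_runs(run_ptr, runs, data, x))  # [14.0, 33.0, 46.0]
```

In this sketch a run of length one costs two integers instead of one, so the saving depends on how long the consecutive runs in a given matrix are.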