Paper Information - CJS: Custom Jacobi Solver

CJS: Custom Jacobi Solver

The classic Jacobi method, widely used for solving linear systems, is slow, especially for large matrices. This paper proposes a Custom Jacobi Solver (CJS) for large-scale linear systems. It is based on a column-wise Jacobi step operation that increases the dependence distance, enabling deep pipelining. The solver can be customised at run time to use either the classic Jacobi method or its more convergence-efficient counterpart, the weighted Jacobi method. It can be dynamically scaled to multiple FPGAs by appropriately partitioning the matrix data among them. Evaluated on a number of different datasets, CJS is up to 71 times faster when an 8-FPGA solution is compared with a 12-core CPU C++ implementation.
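To make the two methods named in the abstract concrete, here is a minimal software sketch of the textbook weighted Jacobi update, x_new[i] = (1 - omega) * x[i] + omega * (b[i] - sum over j != i of A[i][j] * x[j]) / A[i][i], where omega = 1 recovers the classic method. The function name `weighted_jacobi`, its parameters, and the test system are illustrative assumptions, not taken from the paper. The column-by-column accumulation order is only one plausible reading of the "column-wise Jacobi step": consecutive multiply-adds target different accumulators, so successive updates to the same accumulator are n operations apart. It does not model the CJS hardware design.

```python
def weighted_jacobi(A, b, omega=1.0, tol=1e-10, max_iters=10_000):
    """Solve A x = b iteratively; omega=1.0 gives the classic Jacobi method.

    A is a square list of lists with a non-zero diagonal, b is a list.
    Returns (x, iterations_used). All names here are hypothetical.
    """
    n = len(b)
    x = [0.0] * n
    for iteration in range(1, max_iters + 1):
        # Column-wise accumulation (an assumed reading of the abstract):
        # column j scatters x[j] into all n partial sums, so two
        # consecutive multiply-adds never touch the same accumulator.
        acc = [0.0] * n
        for j in range(n):
            xj = x[j]
            for i in range(n):
                if i != j:
                    acc[i] += A[i][j] * xj

        # Weighted update: blend the previous iterate with the Jacobi step.
        x_new = [
            (1.0 - omega) * x[i] + omega * (b[i] - acc[i]) / A[i][i]
            for i in range(n)
        ]

        # Stop when the largest component change falls below the tolerance.
        diff = max(abs(x_new[i] - x[i]) for i in range(n))
        x = x_new
        if diff < tol:
            return x, iteration
    return x, max_iters


if __name__ == "__main__":
    # Small strictly diagonally dominant system chosen for illustration.
    A = [[4.0, 1.0, 0.0],
         [1.0, 5.0, 2.0],
         [0.0, 2.0, 6.0]]
    b = [6.0, 17.0, 22.0]
    for omega in (1.0, 0.8):
        solution, iterations = weighted_jacobi(A, b, omega=omega)
        print(omega, iterations, solution)
```

Switching between the two methods is a matter of changing `omega` at call time, which mirrors the run-time customisation described above only at the level of the arithmetic. Which value of `omega` converges fastest depends on the matrix.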
