CJS: Custom Jacobi Solver

The classic Jacobi method, widely used for solving linear systems, is slow, especially for large matrices. This paper proposes a Custom Jacobi Solver (CJS) for large-scale linear systems. It is based on a column-wise Jacobi step operation, which increases the dependence distance between successive updates and thereby enables deep pipelining. The solver can be customised at run time to use either the classic Jacobi method or its more convergence-efficient counterpart, the weighted Jacobi method. It can be dynamically scaled to multiple FPGAs by partitioning the matrix data among them. Evaluated on a number of different datasets, CJS is up to 71 times faster when an 8-FPGA solution is compared with a 12-core CPU C++ implementation.
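The column-wise step and the run-time choice between classic and weighted Jacobi can be illustrated in software. The following is a minimal sketch, not the paper's hardware design: the function names `jacobi_step_columnwise` and `solve`, the parameter `omega`, and the stopping rule are assumptions made for illustration. It accumulates the product A·x one column at a time, so each accumulator entry is updated only once per column, which is one plausible reading of how the dependence distance is increased. Setting `omega = 1.0` gives the classic method; other values give the weighted method.

```python
def jacobi_step_columnwise(A, b, x, omega=1.0):
    """One Jacobi step, accumulating A @ x column by column.

    Illustrative only: names and structure are assumptions, not the
    paper's implementation.
    """
    n = len(b)
    acc = [0.0] * n
    for j in range(n):              # outer loop walks the columns of A
        xj = x[j]
        for i in range(n):          # acc[i] is updated once per column, so
            acc[i] += A[i][j] * xj  # consecutive updates to it are n apart
    # x_new = x + omega * D^{-1} (b - A x); omega = 1 is classic Jacobi
    return [x[i] + omega * (b[i] - acc[i]) / A[i][i] for i in range(n)]


def solve(A, b, omega=1.0, tol=1e-10, max_iters=10_000):
    """Iterate until the largest change between iterates drops below tol."""
    x = [0.0] * len(b)
    for it in range(1, max_iters + 1):
        x_new = jacobi_step_columnwise(A, b, x, omega)
        if max(abs(u - v) for u, v in zip(x_new, x)) < tol:
            return x_new, it
        x = x_new
    return x, max_iters


if __name__ == "__main__":
    # Small diagonally dominant system whose exact solution is [1, 1, 1].
    A = [[4.0, 1.0, 0.0],
         [1.0, 4.0, 1.0],
         [0.0, 1.0, 4.0]]
    b = [5.0, 6.0, 5.0]
    for omega in (1.0, 0.8):        # classic, then weighted
        x, iters = solve(A, b, omega=omega)
        print(f"omega={omega}: x={x}, iterations={iters}")
```

In this sketch the weight is an ordinary argument, mirroring the idea of switching methods at run time without changing the step itself. A multi-FPGA partitioning could be imitated by giving each worker a block of columns and summing the partial accumulators, but that is not shown here.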
