QTBiCGSTAB Algorithm for Large Linear Systems and Its Parallelization

To address the poor data locality of the traditional stabilized biconjugate gradient algorithm (BiCGSTAB), this paper proposes the QTBiCGSTAB algorithm. Its core idea is to recursively divide the sparse matrix into sub-matrices with a quadtree and reorder them, which improves the cache hit ratio and the algorithm’s efficiency. This structure also lends itself to parallelization, as the numerical experiments confirm. The experiments show, first, that QTBiCGSTAB is more efficient than BiCGSTAB, with a speedup reaching 1:330, and that the target division length influences the algorithm’s performance; second, that for large linear systems the parallelized QTBiCGSTAB is more efficient than the serial version.
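The recursive quadtree division described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the matrix is stored as a list of (row, column, value) triples, and the names `quadtree_order`, `spmv`, and `leaf` (standing in for the target division length) are hypothetical. The sketch only reorders the nonzeros so that entries of the same sub-matrix are stored contiguously, then runs the sparse matrix-vector product that dominates each BiCGSTAB iteration.

```python
def quadtree_order(entries, n, leaf):
    """Reorder (row, col, val) triples of an n x n matrix by recursive quadrant splitting.

    `leaf` is an assumed stand-in for the target division length: a block whose
    side is at most `leaf` is not split further. Requires leaf >= 1.
    """
    def split(block, r0, c0, size):
        if not block:
            return []
        if size <= leaf or len(block) == 1:
            return sorted(block)  # row-major order inside a leaf block
        half = (size + 1) // 2
        quads = [[], [], [], []]  # upper-left, upper-right, lower-left, lower-right
        for r, c, v in block:
            q = 2 * (r >= r0 + half) + (c >= c0 + half)
            quads[q].append((r, c, v))
        out = []
        for q, sub in enumerate(quads):
            out += split(sub, r0 + half * (q // 2), c0 + half * (q % 2), half)
        return out

    return split(list(entries), 0, 0, n)


def spmv(entries, x):
    """Sparse matrix-vector product y = A x over a list of (row, col, val) triples."""
    y = [0.0] * len(x)
    for r, c, v in entries:
        y[r] += v * x[c]
    return y


# Small 4 x 4 example with a division length of 2.
A = [(0, 0, 4.0), (0, 3, 1.0), (1, 1, 3.0), (3, 0, 2.0), (2, 2, 5.0), (3, 3, 6.0)]
reordered = quadtree_order(A, n=4, leaf=2)
y = spmv(reordered, [1.0, 2.0, 3.0, 4.0])
```

Because the reordering only permutes the stored nonzeros, the product is unchanged; any benefit comes from visiting nearby entries of `x` and `y` together. Each leaf block also touches a bounded range of rows and columns, which is one plausible reason such a layout suits parallel execution.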
