Acceleration of conjugate gradient method for circuit simulation using CUDA

The Conjugate Gradient (CG) method is a popular iterative method for solving systems of linear equations and is used in a variety of applications. The DC Analyser is a circuit simulator built at IIT Bombay that solves large circuits containing resistances, voltage sources, and current sources, and it employs the conjugate gradient method. The current generation of graphics cards offers far higher raw processing power and memory bandwidth than conventional CPUs. We have accelerated the conjugate gradient part of the DC Analyser using an Nvidia GTX 280 GPU and the new CUDA technology. For very large circuits, we obtained a speedup of over 10x for the CG method and more than 4x for the entire application, compared to a single-threaded CPU implementation.
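For readers unfamiliar with the method, the following is a minimal CPU-side sketch of textbook conjugate gradient in plain Python. It is not the DC Analyser's code or its CUDA kernels. The function names (`matvec`, `dot`, `conjugate_gradient`) and the tiny two-node resistor circuit are assumptions chosen for illustration, and the matrix is stored densely for brevity, whereas a real circuit matrix would be sparse.

```python
def matvec(A, x):
    """Dense matrix-vector product; a real solver would use a sparse format."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (textbook CG)."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with the initial guess x = 0
    p = list(r)            # first search direction is the residual
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break          # residual norm is small enough
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                       # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]      # next direction
        rs_old = rs_new
    return x


# Hypothetical example: nodal analysis of a two-node circuit with three
# 1-ohm resistors (node 1 to ground, node 1 to node 2, node 2 to ground)
# and a 1 A current source feeding node 1. G is the conductance matrix.
G = [[2.0, -1.0],
     [-1.0, 2.0]]
i_src = [1.0, 0.0]
v = conjugate_gradient(G, i_src)   # node voltages
```

Each iteration consists of one matrix-vector product, two inner products, and a few vector updates. These are data-parallel operations, which is the general reason CG is a natural candidate for GPU acceleration.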
