Robust Parallel Preconditioned Power Grid Simulation on GPU With Adaptive Runtime Performance Modeling and Optimization

Leveraging the power of modern graphics processing units (GPUs) for robust power grid simulation remains a challenging task. Existing preconditioned iterative methods that rely on incomplete matrix factorizations cannot be accelerated effectively on GPUs because of the limited hardware resources and the data-parallel computing model. This paper presents an efficient GPU-based multigrid preconditioning algorithm for robust power grid analysis. By combining a fast geometric multigrid solver with a robust Krylov-subspace iterative solver, power grid DC and transient analysis can be performed efficiently on GPU without loss of accuracy (largest errors < 0.5 mV). Unlike previous GPU-based algorithms that rely on highly regular power grids, the proposed algorithm applies to more general power grid structures. We also propose an accuracy-aware GPU performance modeling and optimization framework that automatically finds the best power grid simulation configurations. Experimental results show that DC and transient analysis on GPU achieves speedups of more than 25X over the best available CPU-based solvers.
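
The combination of a geometric multigrid solver with a Krylov-subspace solver described above can be illustrated with a small CPU-only sketch. It is not the paper's GPU implementation: it assumes a hypothetical 1D resistor chain grounded at both ends (conductance matrix g * tridiag(-1, 2, -1)), uses conjugate gradient as the Krylov method, and applies one symmetric V-cycle with damped Jacobi smoothing as the preconditioner. All function names and parameters are chosen for illustration only.

```python
import math

def apply_A(x, g):
    """y = A x for an assumed resistor chain, A = g * tridiag(-1, 2, -1)."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y[i] = g * (2.0 * x[i] - left - right)
    return y

def jacobi(x, r, g, sweeps, omega=2.0 / 3.0):
    """Damped Jacobi smoothing for A x = r (the diagonal of A is 2 * g)."""
    for _ in range(sweeps):
        Ax = apply_A(x, g)
        x = [xi + omega * (ri - ai) / (2.0 * g) for xi, ri, ai in zip(x, r, Ax)]
    return x

def restrict(r):
    """Apply P^T, the transpose of linear interpolation, to a fine residual."""
    nc = (len(r) - 1) // 2
    return [0.5 * r[2 * j] + r[2 * j + 1] + 0.5 * r[2 * j + 2] for j in range(nc)]

def prolong(c):
    """Linear interpolation P from the coarse grid to the fine grid."""
    f = [0.0] * (2 * len(c) + 1)
    for j, cj in enumerate(c):
        f[2 * j] += 0.5 * cj
        f[2 * j + 1] += cj
        f[2 * j + 2] += 0.5 * cj
    return f

def v_cycle(r, g, sweeps=1):
    """One symmetric V-cycle approximating A^{-1} r; grid size must be 2^k - 1."""
    n = len(r)
    if n == 1:
        return [r[0] / (2.0 * g)]          # exact solve on the coarsest grid
    x = jacobi([0.0] * n, r, g, sweeps)    # pre-smoothing
    res = [ri - ai for ri, ai in zip(r, apply_A(x, g))]
    # Galerkin coarse operator P^T A P equals (g / 2) * tridiag(-1, 2, -1).
    ec = v_cycle(restrict(res), 0.5 * g, sweeps)
    x = [xi + ei for xi, ei in zip(x, prolong(ec))]
    return jacobi(x, r, g, sweeps)         # post-smoothing keeps symmetry

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def mg_pcg(b, g=1.0, tol=1e-10, max_iter=100):
    """Conjugate gradient on A x = b, preconditioned by one V-cycle."""
    x = [0.0] * len(b)
    r = list(b)
    z = v_cycle(r, g)
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for k in range(1, max_iter + 1):
        Ap = apply_A(p, g)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, k
        z = v_cycle(r, g)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# Usage: 63 nodes, each drawing an assumed 1 mA load current.
voltages, iterations = mg_pcg([1e-3] * 63)
```

The smoother and the grid-transfer operators are matrix-free loops over independent nodes, which is the kind of data-parallel work that maps naturally to GPU threads, whereas an incomplete factorization involves sequential triangular solves.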
