Paper information - GPU-based Arnoldi factorisation for accelerating finite element eigenanalysis

We present a GPU-accelerated implementation of the k-step Arnoldi factorisation [1], which forms the basis of a number of iterative eigenvalue solvers. These solvers are important in the finite element analysis of the cutoff and dispersion characteristics of waveguide structures, as well as of cavity resonances [2]. Because they contribute significantly to the runtime of computing a solution, their acceleration is of interest. The initial GPU-based implementation uses the accelerated BLAS [3] routines provided for NVIDIA's CUDA API (CUBLAS) [4]. This allows us to exploit the computational power of the GPU at a functional level, as a proof of concept, with minimal coding effort. The implementation is then refined to use the enhancements to the matrix-vector multiplication routines proposed by Fujimoto [5], further improving performance.
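To make the structure of the k-step Arnoldi factorisation concrete, the following is a minimal CPU-side sketch in plain Python, not the authors' GPU implementation. It assumes a dense real matrix stored as a list of rows and uses modified Gram-Schmidt orthogonalisation; the abstract does not say which orthogonalisation variant the paper uses. The helpers `gemv`, `dot`, `axpy` and `nrm2` are hypothetical stand-ins named after the BLAS operations they imitate, not calls into CUBLAS. They mark the points where an accelerated BLAS routine could be substituted at a functional level.

```python
import math


def gemv(A, x):
    """Dense matrix-vector product A x (stand-in for BLAS level-2 gemv)."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(x, y):
    """Inner product of two vectors (stand-in for BLAS level-1 dot)."""
    return sum(a * b for a, b in zip(x, y))


def axpy(alpha, x, y):
    """Return alpha * x + y as a new vector (stand-in for BLAS level-1 axpy)."""
    return [alpha * a + b for a, b in zip(x, y)]


def nrm2(x):
    """Euclidean norm (stand-in for BLAS level-1 nrm2)."""
    return math.sqrt(dot(x, x))


def arnoldi(A, v0, k, tol=1e-12):
    """Compute a k-step Arnoldi factorisation A V = V H + f e_k^T.

    Returns (V, H, f): V is a list of k orthonormal basis vectors,
    H is a k x k upper Hessenberg matrix (list of rows), and f is
    the residual vector. Assumes v0 is nonzero.
    """
    beta = nrm2(v0)
    if beta == 0.0:
        raise ValueError("starting vector must be nonzero")
    V = [[a / beta for a in v0]]
    H = [[0.0] * k for _ in range(k)]
    f = None
    for j in range(k):
        # The matrix-vector product usually dominates the cost of each step.
        w = gemv(A, V[j])
        # Modified Gram-Schmidt: orthogonalise w against the current basis.
        for i in range(j + 1):
            H[i][j] = dot(V[i], w)
            w = axpy(-H[i][j], V[i], w)
        f = w
        if j + 1 < k:
            beta = nrm2(w)
            if beta < tol:
                raise ValueError("breakdown: invariant subspace reached")
            H[j + 1][j] = beta
            V.append([a / beta for a in w])
    return V, H, f
```

In a typical eigensolver built on this factorisation, the eigenvalues of the small matrix `H` serve as approximations to eigenvalues of `A`. Since each step performs one matrix-vector product plus a handful of vector operations, these are the natural operations to offload to the GPU.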

D. B. Davidson | E. Lezar

[1] Carretera de Valencia, et al. The finite element method in electromagnetics, 2000.

[2] Jack J. Dongarra, et al. Towards dense linear algebra for hybrid GPU accelerated manycore systems, 2009, Parallel Comput.

[3] N. Fujimoto, et al. Faster matrix-vector multiplication on GeForce 8800GTX, 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.

[4] Rajesh Bordawekar, et al. Optimizing Sparse Matrix-Vector Multiplication on GPUs using Compile-time and Run-time Strategies, 2008.

[5] Jens H. Krüger, et al. GPGPU: general purpose computation on graphics hardware, 2004, SIGGRAPH '04.

[6] Charles L. Lawson, et al. Basic Linear Algebra Subprograms for Fortran Usage, 1979, TOMS.

[7] Jack J. Dongarra, et al. A Note on Auto-tuning GEMM for GPUs, 2009, ICCS.

[8] Gene H. Golub, et al. Matrix computations (3rd ed.), 1996.

[9] Howard C. Reader, et al. Understanding Microwave Heating Cavities, 2000.

[10] James Demmel, et al. Benchmarking GPUs to tune dense linear algebra, 2008, HiPC 2008.

[11] Jack Dongarra, et al. Numerical Linear Algebra for High-Performance Computers, 1998.

[12] Robert H. Halstead, et al. Matrix Computations, 2011, Encyclopedia of Parallel Computing.

[13] Jack Dongarra, et al. Templates for the Solution of Algebraic Eigenvalue Problems, 2000, Software, environments, tools.

[14] Michael Garland, et al. Efficient Sparse Matrix-Vector Multiplication on CUDA, 2008.

[15] D. Davidson. Computational Electromagnetics for RF and Microwave Engineering: The method of moments and stratified media: theory, 2005.

[16] D. Pozar. Microwave Engineering, 1990.

[17] Yu Zhu, et al. Multigrid Finite Element Methods for Electromagnetic Field Modeling, 2006.

[18] James Demmel, et al. LAPACK Users' Guide, Third Edition, 1999, Software, Environments and Tools.

[19] J. Demmel, et al. Using GPUs to Accelerate the Bisection Algorithm for Finding Eigenvalues of Symmetric Tridiagonal Matrices, 2007.