On the effective implementation of a boundary element code on graphics processing units using an out-of-core LU algorithm
[1] Tom Davis,et al. OpenGL programming guide: the official guide to learning OpenGL , 1993 .
[2] Robert A. van de Geijn,et al. Anatomy of a Parallel Out-of-Core Dense Linear Solver , 1995, ICPP.
[3] Allen Sherrod,et al. Beginning DirectX 11 Game Programming , 2011 .
[4] Jack Dongarra,et al. The Design and Implementation of the Parallel Out-of-core ScaLAPACK LU, QR, and Cholesky Factorization Routines , 1997 .
[5] Ramani Duraiswami,et al. Fast multipole methods on graphics processors , 2008, J. Comput. Phys..
[6] Jason Sanders,et al. CUDA by example: an introduction to general purpose GPU programming , 2010 .
[7] Jack Dongarra,et al. Key concepts for parallel out-of-core LU factorization , 1998 .
[8] Jack J. Dongarra,et al. Dense linear algebra solvers for multicore with GPU accelerators , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW).
[9] Randima Fernando,et al. The Cg Tutorial: The Definitive Guide to Programmable Real-Time Graphics , 2003 .
[10] Sivan Toledo,et al. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations , 1996, IOPADS '96.
[11] S. Nintcheu Fata,et al. Explicit expressions for 3D boundary integrals in potential theory , 2009 .
[12] Matthew G. Knepley,et al. Biomolecular electrostatics using a fast multipole BEM on up to 512 GPUs and a billion unknowns , 2010, Comput. Phys. Commun..
[13] Sivan Toledo. Locality of Reference in LU Decomposition with Partial Pivoting , 1997, SIAM J. Matrix Anal. Appl..
[14] Jaeyoung Choi,et al. Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines , 1994, Sci. Program..
[15] Timothy G. Mattson,et al. OpenCL Programming Guide , 2011 .
[16] M. Bonnet. Boundary Integral Equation Methods for Solids and Fluids , 1999 .