Low-Order Finite Element Solver with Small Matrix-Matrix Multiplication Accelerated by AI-Specific Hardware for Crustal Deformation Computation
Tjerk P. Straatsma | Tsuyoshi Ichimura | Muneo Hori | Kohei Fujita | Lalith Maddegedara | Christopher J. Zimmer | Takuma Yamaguchi | Naonori Ueda | Akira Naruse | Jack C. Wells
[1] Thomas J. R. Hughes, et al. Solution algorithms for nonlinear transient heat conduction analysis employing element-by-element iterative strategies, 1985.
[2] Ian Parsons, et al. Surface deformation due to shear and tensile faults in a half-space, 1986.
[3] Yousef Saad, et al. A Flexible Inner-Outer Preconditioned GMRES Algorithm, 1993, SIAM J. Sci. Comput.
[4] Gene H. Golub, et al. Inexact Preconditioned Conjugate Gradient Method with Inner-Outer Iteration, 1999, SIAM J. Sci. Comput.
[5] T. Masterlark. Finite element model predictions of static deformation from dislocation sources in a subduction zone: Sensitivities to homogeneous, isotropic, Poisson-solid, and half-space assumptions, 2003.
[6] Yousef Saad, et al. Iterative methods for sparse linear systems, 2003.
[7] Chihiro Hashimoto, et al. 3-D Modelling of Plate Interfaces and Numerical Simulation of Long-term Crustal Deformation in and around Japan, 2004.
[8] Yukitoshi Fukahata, et al. General expressions for internal deformation fields due to a dislocation source in a multilayered elastic half-space, 2005.
[9] Takeji Kometani. GPS Earth Observation Network System, 2005.
[10] Tsuyoshi Ichimura, et al. Earthquake Motion Simulation with Multiscale Finite-Element Analysis on Hybrid Grid, 2007.
[11] John Z. Lou, et al. Geophysical Finite-Element Simulation Tool (GeoFEST): Algorithms and Validation for Quasistatic Regional Faulted Crust Problems, 2008.
[12] Paulius Micikevicius, et al. 3D finite difference computation on GPUs using CUDA, 2009, GPGPU-2.
[13] Kipton Barros, et al. Solving lattice QCD systems of equations using mixed precision solvers on GPUs, 2009, Comput. Phys. Commun.
[14] Jack J. Dongarra, et al. Towards dense linear algebra for hybrid GPU accelerated manycore systems, 2009, Parallel Comput.
[15] Walter D. Mooney, et al. Poroelastic stress-triggering of the 2005 M8.7 Nias earthquake by the 2004 M9.2 Sumatra–Andaman earthquake, 2010.
[16] Christian Bignami, et al. Coseismic slip distribution for the Mw 9 2011 Tohoku‐Oki earthquake derived from 3‐D FE modeling, 2013.
[17] James L. Beck, et al. Bayesian inversion for finite fault earthquake source models I—theory and algorithm, 2013.
[18] Tsuyoshi Ichimura, et al. Physics-Based Urban Earthquake Simulation Enhanced by 10.7 BlnDOF × 30 K Time-Step Unstructured FE Non-Linear Seismic Wave Simulation, 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[19] Constantine Bekas, et al. An extreme-scale implicit solver for complex PDEs: highly heterogeneous flow in earth's mantle, 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[20] Chetan Jhurani, et al. A GEMM interface and implementation on NVIDIA GPUs for multiple small matrices, 2013, J. Parallel Distributed Comput.
[21] Pher Errol Balde Quinay, et al. Implicit nonlinear wave simulation with 1.08T DOF and 0.270T unstructured finite elements to enhance comprehensive earthquake simulation, 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[22] Pritish Narayanan, et al. Deep Learning with Limited Numerical Precision, 2015, ICML.
[23] Ronald M. Summers, et al. Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning, 2016, IEEE Transactions on Medical Imaging.
[24] Jack J. Dongarra, et al. High-Performance Tensor Contractions for GPUs, 2016, ICCS.
[25] Ole Sigmund, et al. Giga-voxel computational morphogenesis for structural design, 2017, Nature.
[26] Timothy A. Davis, et al. Algorithm 9xx: Sparse QR Factorization on the GPU, 2015.
[27] Tsuyoshi Ichimura, et al. Fast and Scalable Low-Order Implicit Unstructured Finite-Element Solver for Earth's Crust Deformation Problem, 2017, PASC.
[28] Marco Maggioni, et al. Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking, 2018, ArXiv.
[29] Tjerk P. Straatsma, et al. A Fast Scalable Implicit Solver for Nonlinear Time-Evolution Earthquake City Problem on Low-Ordered Unstructured Finite Elements with Artificial Intelligence and Transprecision Computing, 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.
[30] Yuri Fialko, et al. Observations and Modeling of Coseismic and Postseismic Deformation Due To the 2015 Mw 7.8 Gorkha (Nepal) Earthquake, 2018.
[31] Nicholas J. Higham, et al. Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers, 2018, SC18: International Conference for High Performance Computing, Networking, Storage and Analysis.
[32] Jack J. Dongarra, et al. The Design of Fast and Energy-Efficient Linear Solvers: On the Potential of Half-Precision Arithmetic and Iterative Refinement Techniques, 2018, ICCS.
[33] Nicholas J. Higham, et al. Accelerating the Solution of Linear Systems by Iterative Refinement in Three Precisions, 2018, SIAM J. Sci. Comput.
[34] Jeffrey S. Vetter, et al. NVIDIA Tensor Core Programmability, Performance & Precision, 2018, 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[35] Jack Dongarra, et al. Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs, 2019, 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[36] Tor M. Aamodt, et al. Modeling Deep Learning Accelerator Enabled GPUs, 2018, 2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).