Two-level parallelization of a fluid mechanics algorithm exploiting hardware heterogeneity
[1] Timothy C. Warburton, et al. Nodal discontinuous Galerkin methods on graphics processors, 2009, J. Comput. Phys.
[2] S. Sherwin, et al. From h to p efficiently: optimal implementation strategies for explicit time-dependent problems using the spectral/hp element method, 2014, International Journal for Numerical Methods in Fluids.
[3] N. Peters, et al. Discussion of Test Problem A, 1982.
[4] Claude Basdevant, et al. Optimizing 2D and 3D structured Euler CFD solvers on Graphical Processing Units, 2012.
[5] Jack Dongarra, et al. Hydrodynamic Computation with Hybrid Programming on CPU-GPU Clusters, 2013.
[6] Rupak Biswas, et al. High performance computing using MPI and OpenMP on multi-core parallel systems, 2011, Parallel Comput.
[7] Jochen Fröhlich, et al. An improved immersed boundary method with direct forcing for the simulation of particle laden flows, 2012, J. Comput. Phys.
[8] John Shalf, et al. The International Exascale Software Project roadmap, 2011, Int. J. High Perform. Comput. Appl.
[9] Christoph W. Kessler, et al. SkePU: a multi-backend skeleton programming library for multi-GPU systems, 2010, HLPP '10.
[10] Satoshi Matsuoka, et al. CUDA vs OpenACC: Performance Case Studies with Kernel Benchmarks and a Memory-Bound CFD Application, 2013, 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing.
[11] T. Poinsot, et al. Theoretical and numerical combustion, 2001.
[12] A. Patera. A spectral element method for fluid dynamics: Laminar flow in a channel expansion, 1984.
[13] G. Karniadakis, et al. Spectral/hp Element Methods for CFD, 1999.
[14] Kim M. Hazelwood, et al. Where is the data? Why you cannot debate CPU vs. GPU performance without the answer, 2011, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[15] Robert Strzodka, et al. Exploring weak scalability for FEM calculations on a GPU-enhanced cluster, 2007, Parallel Comput.
[16] Rolf Dach, et al. Technical Report 2012, 2013.
[17] Willem Hundsdorfer, et al. Partially Implicit BDF2 Blends for Convection Dominated Flows, 2000, SIAM J. Numer. Anal.
[18] Yi Jiang, et al. Collaborating CPU and GPU for large-scale high-order CFD simulations with complex grids on the TianHe-1A supercomputer, 2014, J. Comput. Phys.
[19] Boris Štok, et al. Parallel computing with load balancing on heterogeneous distributed systems, 2003.
[20] Alejandro Duran, et al. Productive Programming of GPU Clusters with OmpSs, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[21] Cédric Augonnet, et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, 2011, Concurr. Comput. Pract. Exp.
[22] Anne E. Trefethen, et al. Design and initial performance of a high-level unstructured mesh framework on heterogeneous parallel systems, 2013, Parallel Comput.
[23] P. Fischer, et al. High-Order Methods for Incompressible Fluid Flow, 2002.
[24] Gordon Erlebacher, et al. High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster, 2010, J. Comput. Phys.