CPU–GPU hybrid parallel strategy for cosmological simulations
Yong Dou | Song Guo | Yuanwu Lei | Yueqing Wang | Dan Zou
[1] Pat Hanrahan,et al. Brook for GPUs: stream computing on graphics hardware, 2004, ACM Trans. Graph.
[2] S. D. Hammond,et al. Performance Analysis of a Hybrid MPI/CUDA Implementation of the NAS-LU Benchmark, 2010.
[3] Piet Hut,et al. A hierarchical O(N log N) force-calculation algorithm , 1986, Nature.
[4] R. Teyssier,et al. Reionization simulations powered by graphics processing units. I. On the structure of the ultraviolet radiation field, 2010, 1004.2503.
[5] Stephen A. Jarvis,et al. Performance analysis of a hybrid MPI/CUDA implementation of the NAS-LU benchmark, 2011, PERV.
[6] Wen-mei W. Hwu,et al. MCUDA: An Efficient Implementation of CUDA Kernels for Multi-core CPUs , 2008, LCPC.
[7] Jérémie Allard,et al. Multi-GPU and Multi-CPU Parallelization for Interactive Physics Simulations , 2010, Euro-Par.
[8] Michael D. McCool,et al. Metaprogramming GPUs with Sh, 2004.
[9] Matthias Teschner,et al. A Parallel SPH Implementation on Multi‐Core CPUs , 2011, Comput. Graph. Forum.
[10] Inanc Senocak,et al. An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters, 2010.
[11] Richard W. Vuduc,et al. A massively parallel adaptive fast-multipole method on heterogeneous architectures , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[12] V. Springel,et al. GADGET: a code for collisionless and gasdynamical cosmological simulations , 2000, astro-ph/0003162.
[13] Makoto Taiji,et al. 42 TFlops hierarchical N-body simulations on GPUs with applications in both astrophysics and turbulence , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[14] V. Springel,et al. Simulating the joint evolution of quasars, galaxies and their large-scale distribution , 2005, astro-ph/0504097.
[15] V. Springel. The Cosmological simulation code GADGET-2 , 2005, astro-ph/0505010.
[16] Teresa H. Y. Meng,et al. Merge: a programming model for heterogeneous multi-core systems , 2008, ASPLOS.
[17] Eduard Ayguadé,et al. An Extension of the StarSs Programming Model for Platforms with Multiple GPUs , 2009, Euro-Par.
[18] Lexing Ying,et al. A massively parallel adaptive fast-multipole method on heterogeneous architectures , 2009, SC.
[19] Mark Baker,et al. MPJ Express Meets Gadget: Towards a Java Code for Cosmological Simulations , 2006, PVM/MPI.
[20] Laxmikant V. Kalé,et al. Scaling Hierarchical N-body Simulations on GPU Clusters , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.