Ultra-Scalable CPU-MIC Acceleration of Mesoscale Atmospheric Modeling on Tianhe-2
Chao Yang | Rajiv Ranjan | Lizhe Wang | Wei Xue | Lin Gan | Haohuan Fu | Yutong Lu | Junfeng Liao | Xinliang Wang | Yangtong Xu
[1] Mariana Vertenstein,et al. Computational performance of ultra-high-resolution capability in the Community Earth System Model , 2012, Int. J. High Perform. Comput. Appl..
[2] Satoshi Matsuoka,et al. Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[3] Chao Yang,et al. A Scalable Fully Implicit Compressible Euler Solver for Mesoscale Nonhydrostatic Simulation of Atmospheric Flows , 2014, SIAM J. Sci. Comput..
[4] Chao Yang,et al. A peta-scalable CPU-GPU algorithm for global atmospheric simulations , 2013, PPoPP '13.
[5] Takashi Shimokawabe,et al. 145 TFlops Performance on 3990 GPUs of TSUBAME 2.0 Supercomputer for an Operational Weather Prediction , 2011, ICCS.
[6] William Putman,et al. The finite-volume dynamical core on the cubed-sphere , 2006, SC.
[7] Satoshi Matsuoka,et al. An 80-Fold Speedup, 15.0 TFlops Full GPU Acceleration of Non-Hydrostatic Weather Model ASUCA Production Code , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[8] Samuel Williams,et al. Implicit and explicit optimizations for stencil computations , 2006, MSPC '06.
[9] Dirk Schmidl,et al. Assessing the Performance of OpenMP Programs on the Intel Xeon Phi , 2013, Euro-Par.
[10] Masaki Satoh,et al. Conservative scheme for the compressible nonhydrostatic models with the horizontally explicit and vertically implicit time integration scheme , 2002 .
[11] Stephen A. Jarvis,et al. Exploring SIMD for Molecular Dynamics , 2013 .
[12] Tom Henderson,et al. Running the NIM Next-Generation Weather Model on GPUs , 2010, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing.
[13] Christiane Jablonowski,et al. Operator-Split Runge-Kutta-Rosenbrock Methods for Nonhydrostatic Atmospheric Models , 2012 .
[14] Ricardo Bianchini,et al. Using communication-to-computation ratio in parallel program design and performance prediction , 1992, [1992] Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing.
[15] Hamid Jafarkhani,et al. On the computation and reduction of the peak-to-average power ratio in multicarrier communications , 2000, IEEE Trans. Commun..
[16] Diego Rossinelli,et al. High throughput software for direct numerical simulations of compressible two-phase flows , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[17] Alan Norton,et al. Petascale WRF simulation of Hurricane Sandy: Deployment of NCSA's Cray XE6 Blue Waters , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[18] William Putman. Graphics Processing Unit (GPU) Acceleration of the Goddard Earth Observing System Atmospheric Model , 2011 .
[19] Fan Zhang,et al. Cluster-Size Scaling and MapReduce Execution Times , 2013, 2013 IEEE 5th International Conference on Cloud Computing Technology and Science.
[20] Giuseppe Coviello,et al. COSMIC: middleware for high performance and reliable multiprocessing on Xeon Phi coprocessors , 2013, HPDC '13.
[21] Sabela Ramos,et al. Modeling communication in cache-coherent SMP systems: a case-study with Xeon Phi , 2013, HPDC.
[22] Chao Yang,et al. Enabling and Scaling a Global Shallow-Water Atmospheric Model on Tianhe-2 , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[23] Xing Liu,et al. Efficient sparse matrix-vector multiplication on x86-based many-core processors , 2013, ICS '13.
[24] Samuel Williams,et al. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[25] Pradeep Dubey,et al. Design and Implementation of the Linpack Benchmark for Single and Multi-node Systems Based on Intel® Xeon Phi Coprocessor , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[26] Nicholas J. Wright,et al. WRF nature run , 2008 .
[27] Nikolaus A. Adams,et al. 11 PFLOP/s simulations of cloud cavitation collapse , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[28] N. Phillips,et al. Scale Analysis of Deep and Shallow Convection in the Atmosphere , 1962 .
[29] Satoshi Matsuoka,et al. Multi-GPU Implementation of the NICAM Atmospheric Model , 2012, Euro-Par Workshops.
[30] Volker Strumpen,et al. The memory behavior of cache oblivious stencil computations , 2007, The Journal of Supercomputing.
[31] Manish Vachharajani,et al. GPU acceleration of numerical weather prediction , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[32] P. Lauritzen. Numerical techniques for global atmospheric models , 2011 .
[33] Matthias Christen,et al. Patus for convenient high-performance stencils: Evaluation in earthquake simulations , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[34] Stephen A. Jarvis,et al. Exploring SIMD for Molecular Dynamics, Using Intel® Xeon® Processors and Intel® Xeon Phi Coprocessors , 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[35] Pradeep Dubey,et al. 3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[36] Lukasz Szustak,et al. Toward efficient distribution of MPDATA stencil computation on Intel MIC architecture , 2013 .
[37] Mark A. Taylor,et al. Progress towards accelerating HOMME on hybrid multi-core systems , 2013, Int. J. High Perform. Comput. Appl..
[38] Jing Sun,et al. GPU acceleration of the WSM6 cloud microphysics scheme in GRAPES model , 2013, Comput. Geosci..
[39] Mikhail Smelyanskiy,et al. Efficient backprojection-based synthetic aperture radar computation with many-core processors , 2012, HiPC 2012.