Parallel Implementation and Optimization of Regional Ocean Modeling System (ROMS) Based on Sunway SW26010 Many-Core Processor
Meihong Yang | Yuan Zhuang | Tao Liu | Ying Guo | Jingshan Pan | Min Tian | Yunhui Zeng
[1] Alexander F. Shchepetkin,et al. The regional oceanic modeling system (ROMS): a split-explicit, free-surface, topography-following-coordinate oceanic model, 2005.
[2] Nathan R. Tallent,et al. HPCTOOLKIT: tools for performance analysis of optimized parallel programs , 2010, Concurr. Comput. Pract. Exp..
[3] Maria Pantoja,et al. Enhancing regional ocean modeling simulation performance with the Xeon Phi architecture , 2017, OCEANS 2017 - Aberdeen.
[4] Wei Ge,et al. The Sunway TaihuLight supercomputer: system and applications , 2016, Science China Information Sciences.
[5] Xin Liu,et al. A Highly Effective Global Surface Wave Numerical Simulation with Ultra-High Resolution , 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[6] Alejandro Duran,et al. The Intel® Many Integrated Core Architecture , 2012, 2012 International Conference on High Performance Computing & Simulation (HPCS).
[7] Weiguo Liu,et al. 18.9-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of 18-Hz and 8-Meter Scenarios , 2017, SC17: International Conference for High Performance Computing, Networking, Storage and Analysis.
[8] Su Wang,et al. A Hybrid Parallel Genetic Algorithm with Dynamic Migration Strategy Based on Sunway Many-Core Processor , 2017, 2017 IEEE 19th International Conference on High Performance Computing and Communications Workshops (HPCCWS).
[9] Yan Zhang,et al. A customized GPU acceleration of the Princeton ocean model , 2014, 2014 IEEE 25th International Conference on Application-Specific Systems, Architectures and Processors.
[10] Hui Lv,et al. Cooperative Computing Techniques for a Deeply Fused and Heterogeneous Many-Core Processor Architecture , 2015, Journal of Computer Science and Technology.
[11] Peter Marwedel,et al. Scratchpad memory: a design alternative for cache on-chip memory in embedded systems , 2002, Proceedings of the Tenth International Symposium on Hardware/Software Codesign. CODES 2002 (IEEE Cat. No.02TH8627).
[12] Erik Lindholm,et al. NVIDIA Tesla: A Unified Graphics and Computing Architecture , 2008, IEEE Micro.
[13] Sally A. McKee,et al. Hitting the memory wall: implications of the obvious , 1995, CARN.
[14] Changsheng Chen,et al. An Unstructured Grid, Finite-Volume, Three-Dimensional, Primitive Equations Ocean Model: Application to Coastal Ocean and Estuaries, 2003.
[15] Rainer Bleck,et al. An oceanic general circulation model framed in hybrid isopycnic-Cartesian coordinates, 2002.
[16] Lei Zhao,et al. A Novel Acceleration Method for DGTD Algorithm on Sunway TaihuLight , 2018, 2018 IEEE Asia-Pacific Conference on Antennas and Propagation (APCAP).
[17] Guangwen Yang,et al. Improving the scalability of the ocean barotropic solver in the community earth system model , 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[18] Interner Bericht. VAMPIR: Visualization and Analysis of MPI Resources, 1996.
[19] Allen D. Malony,et al. The Tau Parallel Performance System , 2006, Int. J. High Perform. Comput. Appl..
[20] Chao Yang,et al. Enabling Highly Efficient k-Means Computations on the SW26010 Many-Core Processor of Sunway TaihuLight , 2019, Journal of Computer Science and Technology.
[21] Cecelia DeLuca,et al. The architecture of the Earth System Modeling Framework , 2003, Computing in Science & Engineering.
[22] Shun Xu,et al. Accelerating Lattice QCD on Sunway Many-Core Processor , 2018, 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom).
[23] R. C. Malone,et al. Parallel ocean general circulation modeling, 1992.
[24] Bo Li,et al. PFSI.sw: A programming framework for sea ice model algorithms based on Sunway many-core processor , 2017, 2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP).
[25] L. Dagum,et al. OpenMP: an industry standard API for shared-memory programming, 1998.
[26] Tao Wu,et al. Optimization of parallel program based on lattice BGK method , 2019, ACM TUR-C.
[27] Chris Lupo,et al. High performance regional ocean modeling with GPU acceleration , 2013, 2013 OCEANS - San Diego.
[28] Chao Yang,et al. Performance Optimization of the HPCG Benchmark on the Sunway TaihuLight Supercomputer , 2018, ACM Trans. Archit. Code Optim..
[29] Xu Ping,et al. 10M-Core Scalable Fully-Implicit Solver for Nonhydrostatic Atmospheric Dynamics, 2016.
[30] A. Blumberg,et al. A Description of a Three‐Dimensional Coastal Ocean Circulation Model, 2013.
[31] Christian Terboven,et al. OpenACC - First Experiences with Real-World Applications , 2012, Euro-Par.
[32] Srikanth Yalavarthi,et al. An early experience of regional ocean modelling on Intel many integrated core architecture , 2014, 2014 21st International Conference on High Performance Computing (HiPC).