PsmArena: Partitioned shared memory for NUMA-awareness in multithreaded scientific applications

Zeyao Mo | Zhang Yang | Aiqing Zhang

[1] Brice Goglin, et al. Enabling high-performance memory migration for multithreaded applications on LINUX, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.

[2] Georg Hager, et al. Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes, 2009, 2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing.

[3] Xiaolin Cao, et al. JASMIN: a parallel software infrastructure for scientific computing, 2010, Frontiers of Computer Science in China.

[4] Murray Cole, et al. NUMA Optimizations for Algorithmic Skeletons, 2018, Euro-Par.

[5] Thomas R. Gross, et al. (Mis)understanding the NUMA memory system performance of multithreaded workloads, 2013, 2013 IEEE International Symposium on Workload Characterization (IISWC).

[6] Haijing Zhou, et al. Massively parallel FDTD program JEMS-FDTD and its applications in platform coupling simulation, 2014, 2014 International Symposium on Electromagnetic Compatibility.

[7] Thomas R. Gross, et al. A library for portable and composable data locality optimizations for NUMA systems, 2015, PPOPP.

[8] Zhang Ai-qing. A Parallel Module for the Multiblock Structured Mesh in JASMIN and Its Applications, 2012.

[9] Jean-François Méhaut, et al. Charm++ on NUMA Platforms: the impact of SMP Optimizations and a NUMA-aware Load Balancing, 2009.

[10] Robert D. Falgout, et al. hypre: A Library of High Performance Preconditioners, 2002, International Conference on Computational Science.

[11] John Shalf, et al. BoxLib with Tiling: An Adaptive Mesh Refinement Software Framework, 2016, SIAM J. Sci. Comput.

[12] Simon David Hammond, et al. memkind: An Extensible Heap Memory Manager for Heterogeneous Memory Platforms and Mixed Memory Policies, 2015.

[13] Aiqing Zhang, et al. A new parallel algorithm for vertex priorities of data flow acyclic digraphs, 2013, The Journal of Supercomputing.

[14] Robert Strzodka, et al. NUMA Aware Iterative Stencil Computations on Many-Core Systems, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.

[15] Torsten Hoefler, et al. NUMA-aware shared-memory collective communication for MPI, 2013, HPDC.

[16] Thomas Hérault, et al. PaRSEC: Exploiting Heterogeneity to Enhance Scalability, 2013, Computing in Science & Engineering.

[17] Jean-François Méhaut, et al. Improving Memory Affinity of Geophysics Applications on NUMA Platforms Using Minas, 2010, VECPAR.

[18] Qingyu Meng, et al. Extending the Uintah Framework through the Petascale Modeling of Detonation in Arrays of High Explosive Devices, 2016, SIAM J. Sci. Comput.

[19] Gerhard Wellein, et al. Introduction to High Performance Computing for Scientists and Engineers, 2010, Chapman and Hall / CRC computational science series.

[20] Brice Goglin, et al. ForestGOMP: An Efficient OpenMP Environment for NUMA Architectures, 2010, International Journal of Parallel Programming.

[21] Tarek A. El-Ghazawi, et al. UPC: unified parallel C, 2006, SC.