PsmArena: Partitioned shared memory for NUMA-awareness in multithreaded scientific applications

[1]  Brice Goglin, et al. Enabling high-performance memory migration for multithreaded applications on LINUX, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.

[2]  Georg Hager, et al. Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes, 2009, 2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing.

[3]  Xiaolin Cao, et al. JASMIN: a parallel software infrastructure for scientific computing, 2010, Frontiers of Computer Science in China.

[4]  Murray Cole, et al. NUMA Optimizations for Algorithmic Skeletons, 2018, Euro-Par.

[5]  Thomas R. Gross, et al. (Mis)understanding the NUMA memory system performance of multithreaded workloads, 2013, 2013 IEEE International Symposium on Workload Characterization (IISWC).

[6]  Haijing Zhou, et al. Massively parallel FDTD program JEMS-FDTD and its applications in platform coupling simulation, 2014, 2014 International Symposium on Electromagnetic Compatibility.

[7]  Thomas R. Gross, et al. A library for portable and composable data locality optimizations for NUMA systems, 2015, PPOPP.

[8]  Zhang Ai-qing. A Parallel Module for the Multiblock Structured Mesh in JASMIN and Its Applications, 2012.

[9]  Jean-François Méhaut, et al. Charm++ on NUMA Platforms: the impact of SMP Optimizations and a NUMA-aware Load Balancing, 2009.

[10]  Robert D. Falgout, et al. hypre: A Library of High Performance Preconditioners, 2002, International Conference on Computational Science.

[11]  John Shalf, et al. BoxLib with Tiling: An Adaptive Mesh Refinement Software Framework, 2016, SIAM J. Sci. Comput.

[12]  Simon David Hammond, et al. memkind: An Extensible Heap Memory Manager for Heterogeneous Memory Platforms and Mixed Memory Policies, 2015.

[13]  Aiqing Zhang, et al. A new parallel algorithm for vertex priorities of data flow acyclic digraphs, 2013, The Journal of Supercomputing.

[14]  Robert Strzodka, et al. NUMA Aware Iterative Stencil Computations on Many-Core Systems, 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.

[15]  Torsten Hoefler, et al. NUMA-aware shared-memory collective communication for MPI, 2013, HPDC.

[16]  Thomas Hérault, et al. PaRSEC: Exploiting Heterogeneity to Enhance Scalability, 2013, Computing in Science & Engineering.

[17]  Jean-François Méhaut, et al. Improving Memory Affinity of Geophysics Applications on NUMA Platforms Using Minas, 2010, VECPAR.

[18]  Qingyu Meng, et al. Extending the Uintah Framework through the Petascale Modeling of Detonation in Arrays of High Explosive Devices, 2016, SIAM J. Sci. Comput.

[19]  Gerhard Wellein, et al. Introduction to High Performance Computing for Scientists and Engineers, 2010, Chapman and Hall/CRC Computational Science Series.

[20]  Brice Goglin, et al. ForestGOMP: An Efficient OpenMP Environment for NUMA Architectures, 2010, International Journal of Parallel Programming.

[21]  Tarek A. El-Ghazawi, et al. UPC: unified parallel C, 2006, SC.