Pre-exascale systems will soon be deployed with deep, complex memory hierarchies composed of many heterogeneous memories, including DRAM, high-bandwidth memory (HBM), and non-volatile random-access memory (NVRAM). This presents several challenges for users, including how to allocate data objects with locality between the memories and devices in these systems, and how to perform these allocations while keeping their applications portable. Currently, users can rely on multiple, disjoint libraries to allocate data objects on these memories. However, it is difficult to obtain locality between memories and devices when using libraries that are unaware of each other. This paper presents the Unified Memory Allocator (UMA) of the SHARed data-structure centric Programming abstraction (SharP) library, which provides a unified interface for memory allocations across DRAM, HBM, and NVRAM and is extensible to future memory types. In addition, the SharP UMA enables portability between systems by supporting both explicit and implicit, intent-based memory allocations. To demonstrate the ease of use of the SharP UMA, we have extended both Open MPI and OpenSHMEM-X to support SharP. We validate this work by evaluating its performance implications and the intent-based approach with synthetic benchmarks as well as adaptations of the Graph500 benchmark.
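To illustrate the distinction between explicit and intent-based allocation, the following C sketch shows one way a unified front end might map a stated intent onto whichever memory a node actually provides. It is a minimal sketch under stated assumptions: every type and function name here is hypothetical and is not the SharP UMA API, the intent-to-memory mapping is an illustrative guess, DRAM is assumed to be present on every node, and `malloc` stands in for the memory-specific allocators a real implementation would call.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical memory kinds (illustrative names, not SharP's). */
typedef enum { MEM_DRAM, MEM_HBM, MEM_NVRAM, MEM_KIND_COUNT } mem_kind_t;

/* Hypothetical usage intents for implicit allocation. */
typedef enum { INTENT_BANDWIDTH, INTENT_CAPACITY, INTENT_LATENCY } mem_intent_t;

/* Which memory kinds the current node offers (nonzero = present). */
typedef struct { int available[MEM_KIND_COUNT]; } node_caps_t;

/* An allocation records the memory kind that was actually selected. */
typedef struct { void *ptr; mem_kind_t kind; } allocation_t;

/* Map an intent to the preferred kind, falling back to DRAM
 * (assumed always present) when the preferred kind is missing. */
mem_kind_t resolve_intent(const node_caps_t *caps, mem_intent_t intent)
{
    switch (intent) {
    case INTENT_BANDWIDTH:
        if (caps->available[MEM_HBM]) return MEM_HBM;
        break;
    case INTENT_CAPACITY:
        if (caps->available[MEM_NVRAM]) return MEM_NVRAM;
        break;
    case INTENT_LATENCY:
        break;
    }
    return MEM_DRAM;
}

/* Explicit allocation: the caller names the memory kind. The request
 * fails (ptr == NULL) if that kind is absent on this node. */
allocation_t alloc_explicit(const node_caps_t *caps, mem_kind_t kind,
                            size_t bytes)
{
    allocation_t a = { NULL, kind };
    if (caps->available[kind])
        a.ptr = malloc(bytes); /* stand-in for a per-kind allocator */
    return a;
}

/* Intent-based allocation: the caller states a goal and the allocator
 * picks the memory kind, so the same call works across systems. */
allocation_t alloc_by_intent(const node_caps_t *caps, mem_intent_t intent,
                             size_t bytes)
{
    return alloc_explicit(caps, resolve_intent(caps, intent), bytes);
}
```

In this sketch, an explicit request for HBM fails on a node without HBM, whereas a bandwidth intent on the same node resolves to DRAM and succeeds. That difference is the portability argument in miniature: the application code does not change when the memory configuration does.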
[1] Manjunath Gorentla Venkata, et al. OpenSHMEM-UCX: Evaluation of UCX for Implementing OpenSHMEM Programming Model. OpenSHMEM, 2016.
[2] Neena Imam, et al. Graph 500 in OpenSHMEM. OpenSHMEM, 2015.
[3] Simon David Hammond, et al. memkind: An Extensible Heap Memory Manager for Heterogeneous Memory Platforms and Mixed Memory Policies. 2015.
[4] Daniel Sunderland, et al. Kokkos: Enabling manycore performance portability through polymorphic memory access patterns. J. Parallel Distributed Comput., 2014.
[5] James Dinan, et al. Symmetric Memory Partitions in OpenSHMEM: A Case Study with Intel KNL. OpenSHMEM, 2017.
[6] Kathryn S. McKinley, et al. Hoard: a scalable memory allocator for multithreaded applications. SIGP, 2000.
[7] Manjunath Gorentla Venkata, et al. SharP: Towards Programming Extreme-Scale Systems with Hierarchical Heterogeneous Memory. 2017 46th International Conference on Parallel Processing Workshops (ICPPW), 2017.
[8] Michael Lang, et al. UNITY: Unified Memory and File Space. ROSS@HPDC, 2017.
[9] Message Passing Interface Forum. MPI: A Message-Passing Interface Standard. 1994.