Paper information - Challenges of Memory Management on Modern NUMA System

Modern server-class systems are typically built from several multicore chips combined in a single system. Each chip has a local DRAM (dynamic random-access memory) module; a chip and its local memory together are referred to as a node. Nodes are connected via a high-speed interconnect, and the system is fully coherent. This means that, transparently to the programmer, a core can issue requests to its own node's local memory as well as to the memories of other nodes. The key distinction is that remote requests take longer, because they are subject to longer wire delays and may have to traverse several hops across the interconnect. Memory-access latency is hence non-uniform: it depends on where the request originates and where it is destined. Such systems are referred to as NUMA (non-uniform memory access) systems.
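The dependence of latency on origin and destination can be illustrated with a toy model. The following is a minimal sketch, not taken from the paper: the four-node topology in `LINKS`, the constants `LOCAL_NS` and `PER_HOP_NS`, and the function names are all hypothetical, and the numbers are illustrative rather than measured. It assumes that each interconnect hop adds a fixed delay on top of the local DRAM latency.

```python
from collections import deque

# Hypothetical 4-node topology: each node lists its directly linked neighbours.
LINKS = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}

LOCAL_NS = 80    # assumed local DRAM latency (illustrative, not measured)
PER_HOP_NS = 50  # assumed extra latency per interconnect hop (illustrative)


def hops(src, dst, links=LINKS):
    """Breadth-first search for the minimum number of interconnect hops."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in links[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError(f"node {dst} is unreachable from node {src}")


def access_latency_ns(core_node, memory_node):
    """Latency under the toy model: local cost plus a fixed cost per hop."""
    return LOCAL_NS + PER_HOP_NS * hops(core_node, memory_node)


if __name__ == "__main__":
    for mem in range(4):
        latency = access_latency_ns(0, mem)
        print(f"core on node 0 -> memory on node {mem}: {latency} ns")
```

Under these assumptions, a core on node 0 sees three latency classes: local (node 0), one hop (nodes 1 and 2), and two hops (node 3). Real systems add further effects, such as interconnect contention, that a fixed per-hop cost does not capture.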
