NUMA capabilities such as explicit migration of memory buffers enable flexible runtime placement of data near the tasks that actually access it. The move_pages system call can be invoked manually, but it achieves limited throughput and requires strong cooperation from the application: the location of threads and their memory access patterns must be known precisely in order to migrate the right buffers at the right time. We present the implementation of a Next-Touch memory placement policy that enables automatic, dynamic migration of pages when they are actually accessed by a task. We introduce a new PTE flag, set up through madvise, and a corresponding Copy-on-Touch code path in the page-fault handler that allocates the new page near the accessing task. We then examine the performance and overheads of this model and compare it with the move_pages system call.