Lightweight process migration and memory prefetching in openMosix

We propose a lightweight process migration mechanism and an adaptive memory prefetching scheme called AMPoM (adaptive memory prefetching in openMosix), which together aim to reduce the migration freeze time in openMosix while preserving the execution efficiency of migrants. To minimize the freeze time, our system transfers only a few pages to the destination node during process migration. After the migration, AMPoM analyzes the spatial locality of memory accesses and iteratively prefetches memory pages from the remote node to hide the latency of inter-node page faults. AMPoM adopts a unique algorithm to decide which pages to prefetch and how many. It tends to prefetch more aggressively when a sequential access pattern emerges, when the paging rate of the process is high, or when the network is busy. This strategy makes AMPoM highly adaptive to different application behaviors and system dynamics. The HPC Challenge benchmark results show that AMPoM can avoid 98% of the migration freeze time while preventing 85-99% of page fault requests after the migration. Compared to openMosix, which does not incur remote page faults, AMPoM adds a modest overhead of 0-5% in runtime. When the working set of a migrant is small, AMPoM outperforms openMosix considerably because less data is transferred. These results indicate that, by exploiting memory access locality and prefetching, process migration can be a lightweight operation with little software overhead in remote paging.
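As a rough illustration of the adaptive behavior described above, the following Python sketch sizes a prefetch window from three signals: how sequential the recent page faults are, the paging rate, and the network load. It is not the AMPoM algorithm, which the abstract does not specify. The class `PrefetchWindow`, the equal weighting of the three signals, the normalization of `paging_rate` and `network_load` to the range 0 to 1, and the window bounds are all assumptions made for this example.

```python
from collections import deque


class PrefetchWindow:
    """Toy adaptive prefetcher (hypothetical; not the AMPoM algorithm)."""

    def __init__(self, history=8, min_pages=1, max_pages=64):
        # Page numbers of the most recent remote page faults.
        self.recent = deque(maxlen=history)
        self.min_pages = min_pages
        self.max_pages = max_pages

    def record_fault(self, page):
        """Remember the page number of a remote page fault."""
        self.recent.append(page)

    def sequential_score(self):
        """Fraction of consecutive fault pairs that differ by exactly +1."""
        if len(self.recent) < 2:
            return 0.0
        pages = list(self.recent)
        hits = sum(1 for a, b in zip(pages, pages[1:]) if b == a + 1)
        return hits / (len(pages) - 1)

    def pages_to_prefetch(self, paging_rate, network_load):
        """Grow the window as any of the three signals increases.

        paging_rate and network_load are assumed to be normalized
        to [0, 1]; the equal weighting is an arbitrary choice.
        """
        weight = (self.sequential_score() + paging_rate + network_load) / 3.0
        span = self.max_pages - self.min_pages
        return self.min_pages + round(weight * span)

    def next_pages(self, paging_rate, network_load):
        """Pages following the last faulting page, sized by the window."""
        if not self.recent:
            return []
        count = self.pages_to_prefetch(paging_rate, network_load)
        start = self.recent[-1] + 1
        return list(range(start, start + count))


if __name__ == "__main__":
    window = PrefetchWindow()
    for page in (10, 11, 12, 13):  # a purely sequential fault stream
        window.record_fault(page)
    print(window.sequential_score())
    print(len(window.next_pages(paging_rate=1.0, network_load=1.0)))
```

In this sketch a purely sequential fault stream with high paging rate and network load drives the window to its maximum, while a scattered fault stream on an idle system falls back to fetching a single page. A real implementation would also need to decide which pages are worth fetching when the pattern is not sequential.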
