DeACT: Architecture-Aware Virtual Memory Support for Fabric Attached Memory Systems

The exponential growth of data has driven technology providers to develop new protocols, such as cache coherent interconnects and memory semantic fabrics, to help users and facilities leverage advances in memory technologies to satisfy these growing memory and storage demands. Using these new protocols, fabric-attached memories (FAM) can be directly attached to a system interconnect and be easily integrated with a variety of processing elements (PEs). Moreover, systems that support FAM can be smoothly upgraded and allow multiple PEs to share the FAM memory pools using well-defined protocols. The sharing of FAM between PEs allows efficient data sharing, improves memory utilization, reduces cost by allowing flexible integration of different PEs and memory modules from several vendors, and makes it easier to upgrade the system. One promising use-case for FAMs is in High-Performance Compute (HPC) systems, where the underutilization of memory is a major challenge. However, adopting FAMs in HPC systems brings new challenges. In addition to cost, flexibility, and efficiency, one particular problem that requires rethinking is virtual memory support for security and performance. To address these challenges, this paper presents decoupled access control and address translation (DeACT), a novel virtual memory implementation that supports HPC systems equipped with FAM. Compared to the state-of-the-art two-level translation approach, DeACT achieves speedup of up to 4.59x (1.8x on average) without compromising security.

[1]  Sui Chen,et al.  Architectural Support for NVRAM Persistence in GPUs , 2020, IEEE Transactions on Parallel and Distributed Systems.

[2]  Thomas F. Wenisch,et al.  Disaggregated memory for expansion and sharing in blade servers , 2009, ISCA '09.

[3]  David H. Bailey,et al.  The Nas Parallel Benchmarks , 1991, Int. J. High Perform. Comput. Appl..

[4]  Tianhao Zhang,et al.  Do-it-yourself virtual memory translation , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).

[5]  Kang G. Shin,et al.  Efficient Memory Disaggregation with Infiniswap , 2017, NSDI.

[6]  Amro Awad,et al.  Page migration support for disaggregated non-volatile memories , 2019, MEMSYS.

[7]  Uzi Vishkin,et al.  An O(log n) Parallel Connectivity Algorithm , 1982, J. Algorithms.

[8]  Scott Shenker,et al.  Network Requirements for Resource Disaggregation , 2016, OSDI.

[9]  Thomas F. Wenisch,et al.  System-level implications of disaggregated memory , 2012, IEEE International Symposium on High-Performance Comp Architecture.

[10]  Garrick Staples,et al.  TORQUE resource manager , 2006, SC.

[11]  Vivien Quéma,et al.  Large Pages May Be Harmful on NUMA Systems , 2014, USENIX Annual Technical Conference.

[12]  Yan Solihin,et al.  Write-Aware Management of NVM-based Memory Extensions , 2016, ICS.

[13]  Marcos K. Aguilera,et al.  Remote regions: a simple abstraction for remote memory , 2018, USENIX ATC.

[14]  Yong Zhao,et al.  Cloud Computing and Grid Computing 360-Degree Compared , 2008, GCE 2008.

[15]  Kai Li,et al.  The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[16]  Yan Solihin,et al.  Avoiding TLB Shootdowns Through Self-Invalidating TLB Entries , 2017, 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT).

[17]  Srilatha Manne,et al.  Accelerating two-dimensional page walks for virtualized systems , 2008, ASPLOS.

[18]  Michael M. Swift,et al.  Agile Paging: Exceeding the Best of Nested and Shadow Paging , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[19]  Keith D. Underwood,et al.  Intel® Omni-path Architecture: Enabling Scalable, High Performance Fabrics , 2015, 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects.

[20]  David A. Wood,et al.  Border control: Sandboxing accelerators , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[21]  Simon D. Hammond,et al.  Performance analysis for using non-volatile memory DIMMs: opportunities and challenges , 2017, MEMSYS.

[22]  Simon David Hammond,et al.  Opal: A Centralized Memory Manager for Investigating Disaggregated Memory Systems. , 2018 .

[23]  David A. Patterson,et al.  The GAP Benchmark Suite , 2015, ArXiv.

[24]  Michael Hamburg,et al.  Meltdown , 2018, meltdownattack.com.

[25]  Mark D. Hill,et al.  Tradeoffs in supporting two page sizes , 1992, ISCA '92.

[26]  Bruce Jacob,et al.  The structural simulation toolkit , 2006, PERV.

[27]  Hakim Weatherspoon,et al.  Shoal: A Network Architecture for Disaggregated Racks , 2019, NSDI.

[28]  Amro Awad,et al.  Samba: A Detailed Memory Management Unit (MMU) for the SST Simulation Framework , 2016 .

[29]  Marcos K. Aguilera,et al.  Remote memory in the age of fast networks , 2017, SoCC.

[30]  Jason Taylor,et al.  Facebook's data center infrastructure: Open compute, disaggregated rack, and beyond , 2015, 2015 Optical Fiber Communications Conference and Exhibition (OFC).

[31]  Chenyang Lu,et al.  RT-Xen: Towards real-time hypervisor scheduling in Xen , 2011, 2011 Proceedings of the Ninth ACM International Conference on Embedded Software (EMSOFT).

[32]  Yiying Zhang,et al.  LegoOS: A Disseminated, Distributed OS for Hardware Resource Disaggregation , 2018, OSDI.

[33]  Amnon Barak,et al.  Optimizing Parallel Graph Connectivity Computation via Subgraph Sampling , 2018, 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS).

[34]  John L. Henning SPEC CPU2006 benchmark descriptions , 2006, CARN.

[35]  Herbert Bos,et al.  Translation Leak-aside Buffer: Defeating Cache Side-channel Protections with TLB Attacks , 2018, USENIX Security Symposium.

[36]  Christian Bienia,et al.  PARSEC 2.0: A New Benchmark Suite for Chip-Multiprocessors , 2009 .

[37]  Jaehyuk Huh,et al.  Revisiting hardware-assisted page walks for virtualized systems , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[38]  Michael Hamburg,et al.  Spectre Attacks: Exploiting Speculative Execution , 2018, 2019 IEEE Symposium on Security and Privacy (SP).