Providing safe, user space access to fast, solid state disks

Emerging fast, non-volatile memories (e.g., phase change memories, spin-torque MRAMs, and memristors) reduce storage access latencies by an order of magnitude compared to state-of-the-art flash-based SSDs. This improved performance means that software overheads that had little impact on flash-based systems can become serious bottlenecks in systems that incorporate these new technologies. We describe a novel storage hardware and software architecture that nearly eliminates two sources of this overhead: entering the kernel and performing file system permission checks. The new architecture provides a private, virtualized interface for each process and moves file system protection checks into hardware. As a result, applications can access file data without operating system intervention, eliminating OS and file system costs entirely for most accesses. We describe the support the system provides for fast permission checks in hardware, our approach to notifying applications when requests complete, and the small, easily portable changes required in the file system to support the new access model. Existing applications require no modification to use the new interface. We evaluate the performance of the system using a suite of microbenchmarks and database workloads and show that the new interface improves latency and bandwidth for 4 KB writes by 60% and 7.2x, respectively, OLTP database transaction throughput by up to 2.0x, and Berkeley-DB throughput by up to 5.7x. A streamlined asynchronous file I/O interface built to fully utilize the new interface enables an additional 5.5x increase in throughput with 1 thread and a 2.8x increase in efficiency for 512 B transfers.
