GPUstore: harnessing GPU computing for storage systems in the OS kernel

Many storage systems include computationally expensive components. Examples include encryption for confidentiality, checksums for integrity, and error-correcting codes for reliability. As storage systems become larger and faster and serve more clients, the demands placed on their computational components increase, and these components can become performance bottlenecks. Many of these computational tasks are inherently parallel: they can be run independently for different blocks, files, or I/O requests. This makes them a good fit for GPUs, a class of processor designed specifically for high degrees of parallelism: consumer-grade GPUs have hundreds of cores and are capable of running hundreds of thousands of concurrent threads. However, because the software frameworks built for GPUs have been designed primarily for the long-running, data-intensive workloads seen in graphics or high-performance computing, they are not well suited to the needs of storage systems. In this paper, we present GPUstore, a framework for integrating GPU computing into storage systems. GPUstore is designed to match the programming models already used in these systems. We have prototyped GPUstore in the Linux kernel and demonstrate its use in three storage subsystems: file-level encryption, block-level encryption, and RAID 6 data recovery. Comparing our GPU-accelerated drivers with the mature CPU-based implementations in the Linux kernel, we show performance improvements of up to an order of magnitude.

[1]  André Brinkmann,et al.  A microdriver architecture for error correcting codes inside the Linux kernel , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.

[2]  Jiri Schindler,et al.  Improving Throughput for Small Disk Requests with Proximal I/O , 2011, FAST.

[3]  Akshat Verma,et al.  Shredder: GPU-accelerated incremental storage and computation , 2012, FAST.

[4]  Peter F. Corbett,et al.  Row-Diagonal Parity for Double Disk Failure Correction (Awarded Best Paper!) , 2004, USENIX Conference on File and Storage Technologies.

[5]  Sangjin Han,et al.  PacketShader: a GPU-accelerated software router , 2010, SIGCOMM '10.

[6]  Erez Zadok,et al.  Cryptfs: A Stackable Vnode Level Encryption File System , 1998 .

[7]  Vijay Kumar,et al.  Efficient galois field arithmetic on SIMD architectures , 2003, SPAA '03.

[8]  Mark Silberstein,et al.  PTask: operating system abstractions to manage GPUs as compute devices , 2011, SOSP.

[9]  Wen-mei W. Hwu,et al.  CUDA-Lite: Reducing GPU Programming Complexity , 2008, LCPC.

[10]  Matt Blaze,et al.  A cryptographic file system for UNIX , 1993, CCS '93.

[11]  Shinpei Kato,et al.  Gdev: First-Class GPU Resource Management in the Operating System , 2012, USENIX Annual Technical Conference.

[12]  Willy Zwaenepoel,et al.  IO-Lite: a unified I/O buffering and caching system , 1999, TOCS.

[13]  Anthony Skjellum,et al.  Gibraltar: A Reed‐Solomon coding library for storage applications on programmable graphics processors , 2011, Concurr. Comput. Pract. Exp..

[14]  John E. Stone,et al.  An asymmetric distributed shared memory model for heterogeneous parallel systems , 2010, ASPLOS XV.

[15]  Matthew Might,et al.  EigenCFA: accelerating flow analysis with GPUs , 2011, POPL '11.

[16]  Erez Zadok,et al.  I3FS: An In-Kernel Integrity Checker and Intrusion Detection File System , 2004, LISA.

[17]  Scott A. Mahlke,et al.  Sponge: portable stream programming on graphics engines , 2011, ASPLOS XVI.

[18]  Sotiris Ioannidis,et al.  Gnort: High Performance Network Intrusion Detection Using Graphics Processors , 2008, RAID.

[19]  Aditya Kashyap File System Extensibility and Reliability Using an in-Kernel Database , 2004 .

[20]  John Waldron,et al.  GPU Accelerated Cryptography as an OS Service , 2010, Trans. Comput. Sci..

[21]  Scott Watanabe Solaris 10 ZFS Essentials , 2010 .

[22]  Pat Hanrahan,et al.  Brook for GPUs: stream computing on graphics hardware , 2004, ACM Trans. Graph..

[23]  Anand Sivasubramaniam,et al.  Evaluating the usefulness of content addressable storage for high-performance data intensive applications , 2008, HPDC '08.

[24]  Seungyeop Han,et al.  SSLShader: Cheap SSL Acceleration with Commodity Processors , 2011, NSDI.

[25]  Kian-Lee Tan,et al.  StegFS: a steganographic file system , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[26]  Jehoshua Bruck,et al.  EVENODD: an optimal scheme for tolerating double disk failures in RAID architectures , 1994, ISCA '94.

[27]  John Waldron,et al.  Practical Symmetric Key Cryptography on Modern Graphics Hardware , 2008, USENIX Security Symposium.

[28]  Sean Quinlan,et al.  Venti: A New Approach to Archival Storage , 2002, FAST.

[29]  Jens H. Krüger,et al.  A Survey of General‐Purpose Computation on Graphics Hardware , 2007, Eurographics.

[30]  Matei Ripeanu,et al.  A GPU accelerated storage system , 2010, HPDC '10.

[31]  Sean Matthew Dorward,et al.  Awarded Best Paper! - Venti: A New Approach to Archival Data Storage , 2002 .

[32]  Tarek S. Abdelrahman,et al.  hiCUDA: a high-level directive-based language for GPU programming , 2009, GPGPU-2.

[33]  Garth A. Gibson,et al.  RAID: high-performance, reliable secondary storage , 1994, CSUR.

[34]  Cezary Dubnicki,et al.  HydraFS: A High-Throughput File System for the HYDRAstor Content-Addressable Storage System , 2010, FAST.

[35]  Muli Ben-Yehuda,et al.  Tapping into the fountain of CPUs: on operating system support for programmable devices , 2008, ASPLOS.

[36]  Anthony Skjellum,et al.  A Lightweight, GPU-Based Software RAID System , 2010, 2010 39th International Conference on Parallel Processing.