S4D-Cache: Smart Selective SSD Cache for Parallel I/O Systems

Parallel file systems (PFSs) are widely used in modern computing systems to mask the ever-increasing performance gap between computing and data access. PFSs favor large requests and perform poorly on small requests, especially small random ones. Newer Solid State Drives (SSDs) deliver excellent performance on small random data accesses, but at a high monetary cost. In this study, we propose a hybrid architecture named the Smart Selective SSD Cache (S4D-Cache), which employs a small set of SSD-based file servers as a selective cache for conventional HDD-based file servers. A novel scheme is introduced to identify performance-critical data and perform selective cache admission, so that the hybrid architecture is fully utilized in terms of data-access parallelism and randomness. We have implemented S4D-Cache under MPI-IO and the PVFS2 parallel file system. Our experiments show that S4D-Cache can significantly improve I/O throughput and is a promising approach for parallel applications.
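
The abstract does not spell out the admission scheme, so the following is only a minimal sketch of what selective cache admission could look like, assuming that "performance-critical" means small and non-sequential. The names `Request`, `should_admit`, and `route`, and the 64 KiB threshold, are hypothetical and not taken from S4D-Cache.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    offset: int  # byte offset within the file
    size: int    # request length in bytes


def should_admit(req: Request, prev_end: Optional[int],
                 small_threshold: int = 64 * 1024) -> bool:
    """Hypothetical rule: admit to the SSD cache only small, random requests."""
    is_small = req.size < small_threshold
    # A request is treated as random if it does not start where the
    # previous request ended (or if there is no previous request).
    is_random = prev_end is None or req.offset != prev_end
    return is_small and is_random


def route(requests: List[Request],
          small_threshold: int = 64 * 1024) -> List[str]:
    """Label each request 'ssd' (cache servers) or 'hdd' (conventional servers)."""
    placement = []
    prev_end = None
    for req in requests:
        admit = should_admit(req, prev_end, small_threshold)
        placement.append("ssd" if admit else "hdd")
        prev_end = req.offset + req.size
    return placement
```

Under this assumed rule, large or sequential requests stay on the HDD-based servers, where striping already serves them well, and only small random requests consume the limited SSD capacity.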
