A Transparent Server-Managed Object Storage System for HPC

On the road to exascale, the high-performance computing (HPC) community is seeing the emergence of multi-tier storage systems. However, existing data management solutions for HPC applications can no longer cope with this increased storage complexity and instead delegate its management back to the user. We describe a novel object-based data abstraction that takes advantage of deep memory hierarchies by providing a simplified programming interface that enables autonomous, asynchronous, and transparent data movement with a server-driven architecture. Users can define a mapping between application memory and abstract storage objects, linking a buffer to all or part of an object's content without copying or transferring data, and thereby avoid explicit management of complex data movement across multiple storage tiers. We evaluate our system by storing plasma physics simulation data with different storage layouts.
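
To make the memory-to-object mapping concrete, the following is a minimal, self-contained C sketch of what such an interface could look like. The names obj_create, region_map, and region_release are illustrative assumptions stubbed out so the example compiles; they are not the system's actual API.

    /* Hypothetical sketch of a memory-to-object mapping interface.
     * All type and function names here are illustrative assumptions,
     * stubbed so the example compiles and runs; they are not the
     * actual API of the system described above. */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct { const char *name; size_t size; } object_t;              /* abstract storage object */
    typedef struct { object_t *obj; size_t off, len; void *buf; } region_t;  /* mapped sub-range of an object */

    /* Stub: create an abstract object of a given size. */
    static object_t *obj_create(const char *name, size_t size) {
        object_t *o = malloc(sizeof *o);
        o->name = name;
        o->size = size;
        return o;
    }

    /* Stub: link an application buffer to a byte range of the object.
     * Conceptually this only records the mapping; no data is copied. */
    static region_t *region_map(object_t *obj, size_t off, size_t len, void *buf) {
        region_t *r = malloc(sizeof *r);
        r->obj = obj;
        r->off = off;
        r->len = len;
        r->buf = buf;
        return r;
    }

    /* Stub: release the mapping; in a server-driven design the server
     * would persist the mapped data asynchronously from here on. */
    static void region_release(region_t *r) {
        printf("flushing %zu bytes of '%s'\n", r->len, r->obj->name);
        free(r);
    }

    int main(void) {
        size_t n = 1 << 20;
        double *field = malloc(n * sizeof *field);

        /* Map the buffer onto an object region: no explicit read/write
         * calls; movement across storage tiers is left to the server. */
        object_t *obj = obj_create("plasma/electric_field", n * sizeof *field);
        region_t  *reg = region_map(obj, 0, n * sizeof *field, field);

        /* ... simulation fills 'field' ... */

        region_release(reg);
        free(obj);
        free(field);
        return 0;
    }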
