Optimizing the Ceph Distributed File System for High Performance Computing

With growing demand for big data analytics and machine learning workloads that operate on diverse data types, high performance computing (HPC) systems need to support a correspondingly diverse set of storage services. Ceph is one candidate for such HPC environments, as it provides interfaces for object, block, and file storage. Ceph, however, was not designed for HPC environments and therefore needs to be optimized for HPC workloads. In this paper, we identify and analyze problems that arise when running HPC workloads on Ceph, and propose a novel optimization technique called F2FS-split, which is based on the F2FS file system combined with several other optimizations. We measure the performance of Ceph in HPC environments and show that, in a write-dominant workload, F2FS-split outperforms F2FS and XFS by 39% and 59%, respectively. We also observe that modifying the Ceph RADOS object size can further improve read speed.
