An object interface storage node for clustered file systems

To sustain scalability, clustered file systems distribute files across multiple I/O nodes in the cluster. Many of these file systems are built on an object storage architecture in which data is represented by variable-sized containers called objects. The I/O nodes, however, expose a POSIX file and directory interface, which does not map well to the object interface. As a result, object storage nodes still have to resolve file paths and perform expensive lookup operations to find inodes and open objects stored as files. In this paper, we propose an alternative object-based data interface in which I/O node data is accessed directly by object ID instead of by character string. We modified an existing Linux file system to provide this object-based interface, which allows faster data access than the traditional directory-based interface. In addition, we modified an Object Storage Device (OSD) to use our object-based interface, improving object creation and access. We believe this object-based interface is a useful alternative to the existing interface and can provide better performance for cluster storage systems.
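
The contrast between path-based and ID-based access can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not the interface from the paper: the `Inode` class, the `open_by_path` and `open_by_id` functions, and the `/objects/00/2a` layout are all hypothetical names chosen for illustration. It counts one lookup per path component on the directory-based route and a single lookup on the object-ID route.

```python
class Inode:
    """Toy inode: a directory has a dict of children, a regular file has None."""

    def __init__(self, data=b"", children=None):
        self.data = data
        self.children = children


def open_by_path(root, path):
    """Resolve a path one component at a time; return (inode, lookup count)."""
    node, lookups = root, 0
    for name in path.strip("/").split("/"):
        if node.children is None:
            raise NotADirectoryError(name)
        node = node.children[name]  # one directory lookup per component
        lookups += 1
    return node, lookups


def open_by_id(object_table, object_id):
    """Fetch an object directly by numeric ID; return (inode, lookup count)."""
    return object_table[object_id], 1


# Hypothetical layout: object 0x2A stored as the file /objects/00/2a.
obj = Inode(data=b"payload")
root = Inode(children={"objects": Inode(children={"00": Inode(children={"2a": obj})})})
object_table = {0x2A: obj}  # flat index keyed by object ID

by_path, path_lookups = open_by_path(root, "/objects/00/2a")
by_id, id_lookups = open_by_id(object_table, 0x2A)
print(path_lookups, id_lookups)
```

In this model both routes reach the same object, but the path-based route pays for every directory level. A real file system adds string comparison, permission checks, and possible disk reads to each of those steps, none of which this sketch attempts to model.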
