An Adaptive Storage and Retrieval Mechanism to Reduce Response-Time in High Performance Computing Clusters

The continuing growth of high performance computing applications calls for novel methods in every aspect of computing systems. The response time of file storage and retrieval operations is one of the most important characteristics of a storage system, and improving it yields higher effective computational power. Consequently, considerable effort has been invested, and various file systems with different architectures have been proposed. Most of them are unaware of the cluster's execution state and do not account for the variation in I/O response time across machines with different storage media, network traffic, and processing load. In this paper, we propose a mechanism that stores and retrieves files with respect to the execution state of the storage nodes and the network topology of the cluster. The proposed architecture has been implemented and evaluated using the Hadoop Distributed File System (HDFS).
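The idea of choosing a storage node from its current execution state and its position in the network can be illustrated with a minimal sketch. It is not the mechanism evaluated in the paper: the `NodeState` fields, the linear cost function, and the weights are all hypothetical, chosen only to show how storage medium, network distance, processing load, and network traffic might be combined into one ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeState:
    """Hypothetical snapshot of one storage node's execution state."""
    name: str
    disk_latency_ms: float   # recent average I/O latency of the storage medium
    network_hops: int        # topology distance from the requesting client
    cpu_load: float          # processing load, 0.0 (idle) to 1.0 (saturated)
    net_utilization: float   # link utilization, 0.0 to 1.0


def estimated_cost(node, w_disk=1.0, w_hop=2.0, w_cpu=10.0, w_net=10.0):
    """Weighted sum used as a stand-in for expected response time.

    The weights are illustrative assumptions, not measured values.
    """
    return (w_disk * node.disk_latency_ms
            + w_hop * node.network_hops
            + w_cpu * node.cpu_load
            + w_net * node.net_utilization)


def rank_nodes(nodes, replicas=1):
    """Return the `replicas` nodes with the lowest estimated cost."""
    return sorted(nodes, key=estimated_cost)[:replicas]


nodes = [
    NodeState("ssd-near-busy", 0.5, 1, 0.9, 0.8),
    NodeState("hdd-near-idle", 8.0, 1, 0.1, 0.1),
    NodeState("ssd-far-idle", 0.5, 3, 0.1, 0.1),
]

for node in rank_nodes(nodes, replicas=2):
    print(node.name, estimated_cost(node))
```

With these assumed weights, a heavily loaded nearby SSD node ranks below an idle node that is farther away, which is the kind of decision a placement policy blind to execution state cannot make.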
