Paper Information - A Cost-Effective Distribution-Aware Data Replication Scheme for Parallel I/O Systems


As the data volumes of high-performance computing applications continue to grow, low I/O performance has become a critical bottleneck for these data-intensive applications. Data replication is a promising approach to improving parallel I/O performance. However, most existing strategies assume that contiguous requests are served more efficiently than non-contiguous requests, which is not necessarily true in a parallel I/O system: because data is distributed across multiple servers, it is indeterminate whether contiguous or non-contiguous requests are the more favorable access pattern. In this study, we propose CEDA, a cost-effective distribution-aware data replication scheme to better support parallel I/O systems. Because logical file access information is insufficient for making replication decisions in a parallel environment, CEDA considers physical data accesses on servers in both data selection and data placement during a parallel replication process. Specifically, CEDA first proposes a distribution-aware cost model to evaluate the file request time under a given data layout, and then carries out cost-effective data replication based on replication benefit analysis. We have implemented CEDA as part of the MPI I/O library, for high portability, on top of the OrangeFS file system. By replaying representative benchmarks and a real application, we collected comprehensive experimental results on both HDD- and SSD-based servers and conclude that CEDA can significantly improve parallel I/O system performance.
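The abstract's claim that contiguous requests are not necessarily cheaper than non-contiguous ones can be illustrated with a toy model. The sketch below is not CEDA's cost model, which the abstract does not spell out. It assumes round-robin striping across servers, a fixed per-segment startup cost on each server touched, a linear per-byte transfer cost, and servers that work fully in parallel, so that a request costs as much as its slowest server. All function names and parameter values are hypothetical.

```python
from collections import defaultdict

KIB = 1024


def split_request(offset, length, stripe_size, num_servers):
    """Map a logical byte range to per-server byte counts (round-robin striping assumed)."""
    per_server = defaultdict(int)
    pos, end = offset, offset + length
    while pos < end:
        stripe_index = pos // stripe_size
        server = stripe_index % num_servers
        stripe_end = (stripe_index + 1) * stripe_size
        chunk = min(end, stripe_end) - pos  # bytes of this request inside the current stripe
        per_server[server] += chunk
        pos += chunk
    return dict(per_server)


def request_cost(segments, stripe_size, num_servers, startup, per_byte):
    """Toy cost of a request made of (offset, length) segments.

    Assumption: each server pays `startup` once per segment it touches plus
    `per_byte` for every byte it serves, and the request finishes when the
    most heavily loaded server finishes.
    """
    load = defaultdict(float)
    for offset, length in segments:
        pieces = split_request(offset, length, stripe_size, num_servers)
        for server, nbytes in pieces.items():
            load[server] += startup + nbytes * per_byte
    return max(load.values()) if load else 0.0


# Hypothetical configuration: 4 servers, 64 KiB stripes, illustrative costs in ms.
params = dict(stripe_size=64 * KIB, num_servers=4, startup=1.0, per_byte=1e-5)

# One contiguous 128 KiB request: lands on servers 0 and 1, 64 KiB each.
contiguous = request_cost([(0, 128 * KIB)], **params)

# Four non-contiguous 32 KiB segments, one per server: same total bytes, better spread.
scattered = request_cost([(i * 64 * KIB, 32 * KIB) for i in range(4)], **params)

# Two non-contiguous 64 KiB segments that both map to server 0: worst case.
same_server = request_cost([(0, 64 * KIB), (256 * KIB, 64 * KIB)], **params)

print(scattered < contiguous < same_server)
```

Under these assumptions, the same 128 KiB of data can be cheaper or more expensive to fetch non-contiguously than contiguously, depending only on how the segments map onto servers. This is the reason logical contiguity alone is a weak guide for replication decisions.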

Xian-He Sun | Shuibing He
