aHDFS: An Erasure-Coded Data Archival System for Hadoop Clusters

In this paper, we propose an erasure-coded data archival system called aHDFS for Hadoop clusters, where RS(k + r, k) codes are employed to archive data replicas in the Hadoop Distributed File System (HDFS). We develop two archival strategies, aHDFS-Grouping and aHDFS-Pipeline, in aHDFS to speed up the data archival process. aHDFS-Grouping, a MapReduce-based data archiving scheme, keeps each mapper's intermediate key-value pairs in a local key-value store. With the local store in place, aHDFS-Grouping merges all intermediate key-value pairs that share the same key into a single key-value pair, which is then shuffled to the reducers to generate the final parity blocks. aHDFS-Pipeline forms a data archival pipeline across multiple data nodes in a Hadoop cluster: each node delivers its merged key-value pair to the local key-value store of the next node in the pipeline, and the last node in the pipeline outputs the parity blocks. We implement aHDFS in a real-world Hadoop cluster. The experimental results show that aHDFS-Grouping and aHDFS-Pipeline speed up Baseline's shuffle and reduce phases by factors of 10 and 5, respectively. When the block size is larger than 32 MB, aHDFS improves the performance of HDFS-RAID and HDFS-EC by approximately 31.8 and 15.7 percent, respectively.
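To make the grouping idea concrete, below is a minimal sketch, not the authors' implementation, of how a mapper-side local key-value store could merge partial parity contributions before they are shuffled. It assumes each contribution has already been multiplied by its RS(k + r, k) generator coefficient, so merging entries that share a key reduces to byte-wise XOR, since addition in GF(2^w) is XOR. The class and method names (LocalParityStore, accumulate, drain) are illustrative and do not come from aHDFS.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of a mapper-side local key-value store in the spirit of
 * aHDFS-Grouping: partial parity contributions that share the same key
 * (stripe id + parity index) are merged into one value, so only a single
 * key-value pair per parity block is shuffled to the reducer.
 *
 * Assumption: each contribution was already multiplied by its RS generator
 * coefficient and all contributions for a key have equal length, so the
 * merge step is plain byte-wise XOR (addition in GF(2^w)).
 */
public class LocalParityStore {

    private final Map<String, byte[]> store = new HashMap<>();

    /** Merge one partial contribution into the entry for the given key. */
    public void accumulate(String parityKey, byte[] contribution) {
        byte[] merged = store.get(parityKey);
        if (merged == null) {
            store.put(parityKey, contribution.clone());
            return;
        }
        for (int i = 0; i < merged.length; i++) {
            merged[i] ^= contribution[i];   // GF(2^w) addition is XOR
        }
    }

    /** One merged key-value pair per parity block, ready to be shuffled. */
    public Map<String, byte[]> drain() {
        Map<String, byte[]> out = new HashMap<>(store);
        store.clear();
        return out;
    }

    public static void main(String[] args) {
        LocalParityStore local = new LocalParityStore();
        // Two partial contributions for the same parity block of stripe 7
        // collapse into one record instead of two shuffled records.
        local.accumulate("stripe-7/parity-0", new byte[]{0x0F, 0x33});
        local.accumulate("stripe-7/parity-0", new byte[]{0x55, 0x11});
        System.out.println(local.drain().size());   // prints 1
    }
}
```

The same accumulate step, repeated at each data node with the merged pair forwarded to the next node's local store instead of to a reducer, gives an intuition for how aHDFS-Pipeline spreads the encoding work along the pipeline.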
