Paper information - Themis: an I/O-efficient MapReduce

Themis: an I/O-efficient MapReduce

"Big Data" computing increasingly utilizes the MapReduce programming model for scalable processing of large data collections. Many MapReduce jobs are I/O-bound, and so minimizing the number of I/O operations is critical to improving their performance. In this work, we present Themis, a MapReduce implementation that reads and writes data records to disk exactly twice, which is the minimum amount possible for data sets that cannot fit in memory. In order to minimize I/O, Themis makes fundamentally different design decisions from previous MapReduce implementations. Themis performs a wide variety of MapReduce jobs -- including click log analysis, DNA read sequence alignment, and PageRank -- at nearly the speed of TritonSort's record-setting sort performance [33].
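The "exactly twice" property can be illustrated with a toy two-pass sort. The sketch below is not Themis code: the `CountingDisk` class, the `two_pass_sort` function, the range-partitioning rule, and the assumption that every intermediate partition fits in memory are all hypothetical choices made for illustration. It only shows how a job can touch each record with two reads and two writes: once to partition the input, once to sort each partition.

```python
from collections import defaultdict


class CountingDisk:
    """Toy in-memory stand-in for a disk that counts record reads and writes."""

    def __init__(self):
        self.files = defaultdict(list)
        self.reads = 0
        self.writes = 0

    def write(self, name, record):
        self.files[name].append(record)
        self.writes += 1

    def read_all(self, name):
        records = list(self.files[name])
        self.reads += len(records)
        return records


def two_pass_sort(disk, input_name, output_name, num_partitions, key_space):
    # Pass 1: read every input record once and write it to the intermediate
    # partition that covers its key range (keys assumed in [0, key_space)).
    for key in disk.read_all(input_name):
        partition = key * num_partitions // key_space
        disk.write(f"partition-{partition}", key)

    # Pass 2: read each partition once (assumed small enough to fit in
    # memory), sort it in memory, and append the result to the output.
    for partition in range(num_partitions):
        for key in sorted(disk.read_all(f"partition-{partition}")):
            disk.write(output_name, key)


disk = CountingDisk()
disk.files["input"] = [42, 7, 93, 15, 68, 3, 77, 51]  # preloaded, not counted
two_pass_sort(disk, "input", "output", num_partitions=4, key_space=100)
print(disk.files["output"])     # [3, 7, 15, 42, 51, 68, 77, 93]
print(disk.reads, disk.writes)  # 16 16, i.e. two reads and two writes per record
```

Because the partitions cover disjoint, increasing key ranges, concatenating the sorted partitions yields a globally sorted output without any further merge pass over the data.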

[1] Bruce G. Lindsay, et al. Random sampling techniques for space efficient online computation of order statistics of large datasets, 1999, SIGMOD '99.

[2] Michael J. Franklin, et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing, 2012, NSDI.

[3] Eduardo Pinheiro, et al. Failure Trends in a Large Disk Drive Population, 2007, FAST.

[4] George Candea, et al. Microreboot - A Technique for Cheap Recovery, 2004, OSDI.

[5] Liang Lin, et al. Tenzing a SQL implementation on the MapReduce framework, 2011, Proc. VLDB Endow.

[6] Bianca Schroeder, et al. A Large-Scale Study of Failures in High-Performance Computing Systems, 2006, IEEE Transactions on Dependable and Secure Computing.

[7] Michael C. Schatz, et al. CloudBurst: highly sensitive read mapping with MapReduce, 2009, Bioinform.

[8] L. Alvisi, et al. A Survey of Rollback-Recovery Protocols, 2002.

[9] Randy H. Katz, et al. Improving MapReduce Performance in Heterogeneous Environments, 2008, OSDI.

[10] Frank Dabek, et al. Large-scale Incremental Processing Using Distributed Transactions and Notifications, 2010, OSDI.

[11] Eric Anderson, et al. Efficiency matters!, 2010, OPSR.

[12] Michael D. Ernst, et al. HaLoop, 2010, Proc. VLDB Endow.

[13] Wenbing Zhao. Recovery-Oriented Computing, 2014.

[14] Jeffrey Scott Vitter, et al. Random sampling with a reservoir, 1985, TOMS.

[15] Magdalena Balazinska, et al. Skew-resistant parallel processing of feature-extracting scientific user-defined functions, 2010, SoCC '10.

[16] David J. DeWitt, et al. Parallel database systems: the future of high performance database systems, 1992, CACM.

[17] Christopher Olston, et al. Stateful bulk processing for incremental analytics, 2010, SoCC '10.

[18] Raghu Ramakrishnan, et al. Sailfish: a framework for large scale data processing, 2012, SoCC '12.

[19] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.

[20] Magdalena Balazinska, et al. SkewTune: mitigating skew in mapreduce applications, 2012, SIGMOD Conference.

[21] Edward A. Lee, et al. Advances in the dataflow computational model, 1999, Parallel Comput.

[22] Alok Aggarwal, et al. The input/output complexity of sorting and related problems, 1988, CACM.

[23] K. K. Ramakrishnan, et al. Eliminating receive livelock in an interrupt-driven kernel, 1996, TOCS.

[24] Srinivasan Seshan, et al. Subtleties in Tolerating Correlated Failures in Wide-area Storage Systems, 2006, NSDI.

[25] Andrea C. Arpaci-Dusseau, et al. Fail-stutter fault tolerance, 2001, Proceedings Eighth Workshop on Hot Topics in Operating Systems.

[26] Dennis K. J. Lin, et al. Data skeletons: simultaneous estimation of multiple quantiles for massive streaming datasets with applications to density estimation, 2007, Stat. Comput.

[27] Van-Anh Truong, et al. Availability in Globally Distributed Storage Systems, 2010, OSDI.

[28] Joseph M. Hellerstein, et al. Flux: an adaptive partitioning operator for continuous query systems, 2003, Proceedings 19th International Conference on Data Engineering.

[29] David J. DeWitt, et al. Parallel sorting on a shared-nothing architecture using probabilistic splitting, 1991, Proceedings of the First International Conference on Parallel and Distributed Information Systems.

[30] Bianca Schroeder, et al. Understanding disk failure rates: What does an MTTF of 1,000,000 hours mean to you?, 2007, TOS.

[31] Marios Hadjieleftheriou, et al. Robust Sketching and Aggregation of Distributed Data Streams, 2005.

[32] Eric Bauer, et al. Practical System Reliability, 2009.

[33] Amin Vahdat, et al. TritonSort: A Balanced Large-Scale Sorting System, 2011, NSDI.