A holistic cross-layer optimization approach for mitigating stragglers in in-memory data processing

Abstract: In-memory data processing frameworks (e.g., Spark) make big data analysis much simpler and more efficient. However, stragglers, i.e., tasks that take much longer to finish than their peers, significantly degrade performance. Stragglers arise from multiple factors in both the hardware resource layer and the application layer, e.g., hardware heterogeneity, interference, data locality, and data skew. While state-of-the-art straggler mitigation techniques offer partial solutions for data skew and data locality, we experimentally demonstrate that the other factors can also cause serious problems. We present Clio, a cross-layer, interference-aware optimization system that effectively mitigates stragglers in data processing frameworks. Clio supports the scheduling of both map and reduce tasks. It heuristically dispatches intermediate data in proportion to the actual computing ability of each worker node, which it estimates by considering the various straggler factors, so as to balance task completion times at a much finer granularity. We implement Clio in Apache Spark and evaluate its performance using both synthetic and real datasets. Experimental results show that Clio can speed up application execution by up to 67% compared with existing algorithms.
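The idea of dispatching intermediate data in proportion to each worker's computing ability can be illustrated with a minimal sketch. It is not Clio's actual algorithm, which the abstract does not specify. It assumes that each worker's ability has already been estimated as a single positive number (data units processed per second) and that intermediate data arrives as indivisible blocks of known size. It then applies a simple greedy rule: the largest remaining block goes to the worker whose estimated finish time would stay lowest. The names `dispatch_by_capacity`, `estimated_finish_times`, `block_sizes`, and `capacity` are hypothetical.

```python
from typing import Dict, List


def dispatch_by_capacity(
    block_sizes: Dict[str, int], capacity: Dict[str, float]
) -> Dict[str, List[str]]:
    """Greedily assign data blocks so that load / capacity stays balanced.

    block_sizes: hypothetical intermediate-data blocks and their sizes.
    capacity: assumed per-worker processing rate (size units per second).
    """
    if not capacity or any(c <= 0 for c in capacity.values()):
        raise ValueError("every worker needs a positive capacity estimate")

    assignment: Dict[str, List[str]] = {worker: [] for worker in capacity}
    load: Dict[str, int] = {worker: 0 for worker in capacity}

    # Place the largest blocks first; ties are broken by name for determinism.
    ordered = sorted(block_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for block, size in ordered:
        # Choose the worker that would finish earliest after taking this block.
        best = min(
            capacity,
            key=lambda w: ((load[w] + size) / capacity[w], w),
        )
        assignment[best].append(block)
        load[best] += size
    return assignment


def estimated_finish_times(
    assignment: Dict[str, List[str]],
    block_sizes: Dict[str, int],
    capacity: Dict[str, float],
) -> Dict[str, float]:
    """Estimated completion time of each worker: assigned data / capacity."""
    return {
        worker: sum(block_sizes[b] for b in blocks) / capacity[worker]
        for worker, blocks in assignment.items()
    }


if __name__ == "__main__":
    blocks = {"a": 40, "b": 30, "c": 20, "d": 10}
    workers = {"fast": 2.0, "slow": 1.0}  # assumed ability estimates
    plan = dispatch_by_capacity(blocks, workers)
    print(plan)
    print(estimated_finish_times(plan, blocks, workers))
```

With these illustrative inputs, the worker assumed to be twice as fast receives 70 of the 100 data units, so the two estimated finish times (35 s and 30 s) end up close to each other instead of the slower node becoming a straggler.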
