Improving MapReduce Performance with Progress and Feedback Based Speculative Execution

Task stragglers dramatically impede parallel job execution in data-intensive computing within cloud datacenters. Uneven distribution of input data, which results from heterogeneous data nodes, resource contention, and network configurations, causes delay failures by violating job completion time. Data-intensive computing frameworks such as MapReduce or Hadoop employ a mechanism called speculative execution to deal with stragglers, but its effectiveness is limited because in many cases straggler identification occurs too late within a job's lifecycle. Both identifying a straggler and the timing of that identification are very important for straggler mitigation in data-intensive cloud computing. Speculative execution is widely adopted as a straggler identification and mitigation scheme, but it has certain inherent limitations. In this paper, we strive to make Hadoop more efficient in cloud environments. We present the Progress and Feedback based Speculative Execution (PFSE) algorithm, a new straggler identification scheme that identifies straggler MapReduce tasks using feedback information received from completed tasks in addition to the progress of the currently running tasks. Our extensive simulations show that PFSE can outperform dynamic scheduling techniques such as the Self-Learning MapReduce scheduler (SLM) and LATE. PFSE can help improve straggler identification and mitigation for tolerating late-timing failures within data-intensive cloud computing.
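The abstract does not give the PFSE algorithm itself, so the following is only a minimal Python sketch of the general idea it describes: combining the progress of running tasks with feedback from completed tasks to flag likely stragglers. The names `RunningTask`, `estimated_total_time`, `find_stragglers`, and the `slowdown_factor` threshold are all hypothetical and chosen for illustration; the actual PFSE rule may differ.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunningTask:
    task_id: str
    progress: float  # fraction of work done, between 0 and 1
    elapsed: float   # seconds since the task started


def estimated_total_time(task: RunningTask) -> float:
    """Project total run time from progress so far (elapsed / progress).

    This equals elapsed time plus remaining work divided by the
    observed progress rate. Requires task.progress > 0.
    """
    return task.elapsed / task.progress


def find_stragglers(running, completed_durations, slowdown_factor=1.5):
    """Flag running tasks projected to take much longer than finished ones.

    completed_durations is the feedback signal: run times (seconds) of
    tasks that already completed. slowdown_factor is an assumed,
    illustrative threshold, not a value taken from the paper.
    """
    if not completed_durations:
        return []  # no feedback yet, so nothing to compare against
    baseline = mean(completed_durations)
    flagged = [
        t for t in running
        if t.progress > 0  # skip tasks with no measurable progress yet
        and estimated_total_time(t) > slowdown_factor * baseline
    ]
    # Slowest projected tasks first, as candidates for speculative copies.
    return sorted(flagged, key=estimated_total_time, reverse=True)


if __name__ == "__main__":
    running = [
        RunningTask("a", progress=0.5, elapsed=10.0),
        RunningTask("b", progress=0.1, elapsed=10.0),
        RunningTask("c", progress=0.0, elapsed=1.0),
    ]
    print([t.task_id for t in find_stragglers(running, [20.0, 22.0, 18.0])])
```

In this sketch a task is compared against the mean duration of completed tasks rather than against other running tasks, which is one simple way to let completed-task feedback drive the decision early in a job.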
