A Hybrid Resource Scheduling Strategy in Speculative Execution Based on Non-cooperative Game Theory

Hadoop is a well-known parallel computing framework for processing large-scale data, but its performance is seriously degraded by so-called “straggling tasks” (stragglers). Speculative execution is an efficient way to handle stragglers: it monitors the real-time progress rate of running tasks and launches a backup copy of each straggler on another node, increasing the chance that the backup task completes early. Existing speculative execution strategies have several problems, such as misjudgement of stragglers and improper selection of backup nodes, which make speculative execution inefficient. This paper proposes a hybrid resource scheduling strategy in speculative execution based on non-cooperative game theory (HRSE), which transforms the resource scheduling of backup tasks in speculative execution into a multi-party non-cooperative game. Each backup task group is a player, the computing nodes are the players' strategies, and the utility function is the overall task execution time of the cluster. When the game reaches a Nash equilibrium, the final resource scheduling scheme is obtained. We implemented the strategy in Hadoop-2.6.0; experimental results show that the scheduling scheme can guarantee the efficiency of speculative execution and improve the fault tolerance of the computation under high cluster load.
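The game formulation above can be illustrated with a small sketch. It is not the HRSE algorithm itself, whose details the abstract does not give. It assumes that each backup task group has a known amount of work, that each node has a known processing speed, and that the "overall task execution time" is the cluster makespan (the finishing time of the slowest node). It also swaps in plain best-response dynamics as the way to reach an equilibrium. The names `makespan`, `best_response_schedule`, `work`, and `speed` are hypothetical.

```python
from typing import Dict


def makespan(assignment: Dict[str, str],
             work: Dict[str, float],
             speed: Dict[str, float]) -> float:
    """Finishing time of the slowest node (assumed shared utility, lower is better)."""
    load = {node: 0.0 for node in speed}
    for group, node in assignment.items():
        load[node] += work[group]
    return max(load[node] / speed[node] for node in speed)


def best_response_schedule(work: Dict[str, float],
                           speed: Dict[str, float],
                           max_rounds: int = 1000) -> Dict[str, str]:
    """Let each backup task group switch nodes while that strictly lowers the makespan.

    The loop stops when no group can improve by moving alone, which is a
    pure Nash equilibrium of this (assumed) identical-interest game.
    """
    nodes = sorted(speed)
    # Arbitrary starting point: every group on the first node.
    assignment = {group: nodes[0] for group in work}
    for _ in range(max_rounds):
        changed = False
        for group in sorted(work):
            best_node = assignment[group]
            best_cost = makespan(assignment, work, speed)
            for node in nodes:
                trial = dict(assignment)
                trial[group] = node
                cost = makespan(trial, work, speed)
                if cost < best_cost - 1e-12:  # accept strict improvements only
                    best_node, best_cost = node, cost
            if best_node != assignment[group]:
                assignment[group] = best_node
                changed = True
        if not changed:
            return assignment
    raise RuntimeError("no equilibrium reached within max_rounds")


if __name__ == "__main__":
    # Hypothetical workloads (arbitrary units) and node speeds (units per second).
    work = {"g1": 4.0, "g2": 2.0, "g3": 2.0}
    speed = {"n1": 2.0, "n2": 1.0}
    schedule = best_response_schedule(work, speed)
    print(schedule, makespan(schedule, work, speed))
```

Because every accepted move strictly lowers the shared makespan and the number of possible assignments is finite, the loop terminates. The resulting equilibrium is not guaranteed to be the globally optimal schedule.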
