Dynamic memory-aware scheduling in the Spark computing environment

Abstract: Scheduling plays an important role in the performance of big-data parallel processing. Spark is an in-memory parallel computing framework that uses a multi-threaded model for task scheduling. Most Spark task scheduling does not take memory into account; instead, the number of concurrent task threads is fixed by the user, which can limit performance. To overcome this limitation in the Spark-core source code, this paper proposes a dynamic memory-aware task scheduler (DMATS), which treats memory and network I/O as computational resources and dynamically adjusts concurrency when scheduling tasks. Specifically, we first analyze the RDD-based Spark execution engine to obtain the amount of data each task processes, and propose an algorithm that estimates an initial adaptive task concurrency from the known task input size and the executor memory. Then, a dynamic adjustment algorithm changes the concurrency at runtime through feedback, making optimal use of limited memory resources. We implement DMATS in Spark 2.3.4 and evaluate its performance with two typical classes of benchmarks, shuffle-light and shuffle-heavy. The results show that DMATS reduces execution time by 43.64% and significantly improves resource utilization. Experiments also show that our method has advantages over similar work such as WASP.
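The two-phase idea in the abstract (estimate an initial concurrency from executor memory and per-task input size, then adjust it via runtime feedback) can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual algorithm: the function names, the `expansion_factor` memory model, and the AIMD-style adjustment rule are all assumptions introduced here for clarity.

```python
# Hypothetical sketch of DMATS-style concurrency control.
# Assumptions (not from the paper): each task needs roughly
# input_size * expansion_factor bytes of memory, and feedback
# arrives as a memory-pressure ratio plus a spill flag.

def initial_concurrency(executor_memory_bytes: int,
                        task_input_bytes: int,
                        expansion_factor: float = 2.0,
                        max_threads: int = 32) -> int:
    """Estimate how many task threads fit in executor memory."""
    per_task = task_input_bytes * expansion_factor
    if per_task <= 0:
        return max_threads
    fit = int(executor_memory_bytes // per_task)
    return max(1, min(max_threads, fit))

def adjust_concurrency(current: int,
                       memory_pressure: float,
                       spilled: bool,
                       max_threads: int = 32) -> int:
    """Feedback-driven adjustment, AIMD-style: back off
    multiplicatively when tasks spill or memory is nearly full,
    otherwise probe additively for more parallelism."""
    if spilled or memory_pressure > 0.9:
        return max(1, current // 2)        # multiplicative decrease
    return min(max_threads, current + 1)   # additive increase
```

For example, with 8 GiB of executor memory and 256 MiB task inputs, the initial estimate admits 16 concurrent tasks; a subsequent spill would halve that to 8, after which the scheduler would probe upward one thread at a time while memory pressure stays low.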
