Paper information - Research on mixed tasks scheduling in YARN

Research on mixed tasks scheduling in YARN

YARN supports a variety of computing frameworks, so different types of computing tasks and jobs can run on the YARN platform. By type, jobs can be divided into batch jobs and interactive jobs. By size, a single job can be classified as small, medium, or big. Based on the combinations and quantities of these jobs, this paper divides workloads into single, mixed, and batch jobs, simulating interactive jobs with batches of small jobs and batch jobs with batches of big jobs. The paper then studies how different scheduling strategies affect the execution of the different job types. Experiments show that for interactive jobs, the Fair scheduler performs better than the Capacity scheduler; for batch jobs, provided that resources are not limited, Fair also performs better than Capacity.

Chao Zhang | Junmin Wu | Zhaocong Wen | Jintao Mo
