论文信息 - Resource scheduling method under Hadoop-based multi-job environment

Resource scheduling method under Hadoop-based multi-job environment

The invention discloses a resource scheduling method under a Hadoop-based multi-job environment, which includes: (1) collecting the three-party monitoring information of cluster loads, a Hadoop platform and hardware in real time; (2) collecting the job execution monitoring information of a user on each computing node of a cluster in real time; (3) gathering the three-party monitoring data of the cluster, modeling to evaluate the computing capabilities of the nodes, and dividing the nodes of the cluster into superior computing nodes and inferior computing nodes; (4) if the nodes are the superior computing nodes, then starting a job task resource demand allocation policy based on similarity evaluation; (5) if the nodes are the inferior computing nodes, then returning to a default resource demand allocation policy of the Yarn. The resource scheduling method under the Hadoop-based multi-job environment solves the problem of resource fragments caused by oversize job resource demand division granularity in conventional resource schedulers of the Yarn, can comprehensively take the heterogeneity of cluster nodes and jobs into consideration, and increases the execution concurrency of the cluster by reasonably and effectively allocating the node resources, thus increasing the execution efficiency of the multiple jobs of the Hadoop cluster.

王芳 | 冯丹 | 周俊 | 杨静怡 | 潘佳艺