SARN: A scalable resource managing framework for YARN

With the rapid growth of the Internet, we have entered the era of big data. In this era, Hadoop Yet Another Resource Negotiator (YARN) is one of the commonly used frameworks for big data processing. YARN provides explicit support for programming model diversity, so multiple frameworks such as Storm, HBase, and Hive can run as applications on YARN [1]. To make better use of hardware resources and improve cluster efficiency, a scalable resource managing framework (SARN) is presented. SARN includes a dynamic container and a quick deploy component for elastically expanding or shrinking the number of YARN worker nodes to meet the actual resource needs of multiple tasks. Experiments are conducted in both idle mode (the cluster is idle) and eager mode (the cluster's resources are relatively insufficient to meet the applications' requirements). The experimental results show that the approach improves application performance in both situations.

[1] Carlo Curino, et al. Apache Hadoop YARN: yet another resource negotiator, 2013, SoCC.

[2] Kevin Skadron, et al. Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations, 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[3] Bo Hong, et al. Clotho: an elastic MapReduce workload/runtime co-design, 2013, ARM '13.

[4] Mohamed Faten Zhani, et al. DREAMS: Dynamic resource allocation for MapReduce with data skew, 2015, 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM).

[5] Qi Zhang, et al. Improving MapReduce Performance in a Heterogeneous Cloud: A Measurement Study, 2014, 2014 IEEE 7th International Conference on Cloud Computing.

[6] Cees Witteveen, et al. ThroughputScheduler: Learning to Schedule on Heterogeneous Hadoop Clusters, 2013, ICAC.

[7] Minghua Chen, et al. Moving Big Data to The Cloud: An Online Cost-Minimizing Approach, 2013, IEEE Journal on Selected Areas in Communications.

[8] Dick H. J. Epema, et al. Dynamically Scheduling a Component-Based Framework in Clusters, 2014, JSSPP.

[9] Chen He, et al. HOG: Distributed Hadoop MapReduce on the Grid, 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.

[10] Roy H. Campbell, et al. ARIA: automatic resource inference and allocation for mapreduce environments, 2011, ICAC '11.