Breaking the boundary for whole-system performance optimization of big data

MapReduce plays a critical role in extracting insights from Big Data. Optimizing the performance of MapReduce programs is challenging because it requires a comprehensive understanding of the whole system, including both the hardware layers (processors, storage, networks, etc.) and the software stack (operating system, JVM, runtime, applications, etc.). However, most existing performance tuning and optimization relies on empirical and heuristic attempts. How to build a systematic framework that breaks the boundaries between layers for performance optimization remains an open question. In this paper, we propose a performance evaluation framework that correlates performance metrics from different layers to provide insights for efficiently pinpointing performance issues. The framework is composed of a series of predefined patterns, each of which indicates one or more potential issues. The behavior of a MapReduce program is mapped to the corresponding resource utilization. The framework provides a holistic approach that allows users at different levels of experience to optimize MapReduce program performance. We use the Terasort benchmark running on a 10-node Power7R2 cluster as a real case to show how this framework improves performance. Using this framework, we improve the Terasort result from 47 minutes to less than 8 minutes. In addition to best practices for performance tuning, we summarize several key findings from the workload analysis that are relevant to JVM, MapReduce runtime, and application design.
