Formal performance evaluation of the Map/Reduce framework within cloud computing

The recent appearance, evolution and massive expansion of social media-based technologies, together with what is now known as the Internet of Things, has led to data being produced at a dizzying rate. One of the main contributions to addressing this problem has been the Hadoop framework (which implements the Map/Reduce paradigm), especially when used in conjunction with Cloud computing environments. In this paper, a comprehensive and rigorous study of the Map/Reduce framework using formal methods is presented. Specifically, the timed process algebra BTC is used, and the resulting formal model is evaluated with a real Hadoop-based application that processes social media data. Moreover, the formal model is validated by carrying out several experiments on a real private Cloud environment. Finally, the formal model outcomes are used to determine the best performance–cost trade-off in a real scenario. Results show that the proposed model makes it possible to determine in advance both the performance of a Hadoop-based application within Cloud environments and the best performance–cost trade-off.
