Towards Control of MapReduce Performance and Availability

MapReduce is a popular programming model for distributed data processing and Big Data applications. Extensive research has been conducted either to improve the dependability or to increase performance of MapReduce, ranging from adaptive and on-demand fault-tolerance solutions, adaptive task scheduling techniques to optimized job execution mechanisms. This paper investigates a novel solution that controls MapReduce systems and provides guarantees in terms of both performance and availability, while reducing utilization costs. We follow a control theoretic approach for MapReduce cluster scaling and admission control. Preliminary results based on a simulation environment, previously validated on a real MapReduce cluster, show the effectiveness of the proposed control solutions for a Hadoop MapReduce cluster.