Resource Provisioning for MapReduce Computation in Cloud Container Environment

MapReduce is a major computing model for big data solutions in distributed virtual computing environments, and cloud container environments are one of the platforms used to run MapReduce tasks. However, a new challenge is the lack of resource provisioning support for containerized MapReduce computations with deadline requirements. The two major resource provisioning strategies, static and dynamic, do not solve this challenge satisfactorily. This paper presents a resource provisioning framework that integrates semi-static and dynamic strategies to address it. The framework includes a performance model that estimates the minimum resource requirements under a deadline constraint, and a scheduler that adjusts resource allocation. Experimental results show that the proposed semi-static framework completes the MapReduce computation with less resource utilization while meeting the given deadline. However, the proposed dynamic resource provisioning is not suitable for our scenario because of its resource overhead and late completion.