Learning based opportunistic admission control algorithm for MapReduce as a service

Admission control has proven essential in utility-driven grid computing, both for avoiding resource overload and for meeting user service demands. The recent emergence of Cloud-based services and the popularity of the MapReduce paradigm in Cloud Computing environments make the problem of admission control intriguing. We propose a model that allows MapReduce jobs to be offered as on-demand services. We present a learning-based opportunistic algorithm that admits MapReduce jobs only if they are unlikely to cross the overload threshold set by the service provider. The algorithm meets deadlines negotiated by users in more than 80% of cases. We employ an automatically supervised Naive Bayes classifier to label incoming jobs as admissible or non-admissible. From the jobs classified as admissible, we then pick the job that is expected to maximize service provider utility. An external supervision rule automatically evaluates the algorithm's decisions in retrospect and trains the classifier. We evaluate our algorithm by simulating a Cloud-hosted MapReduce cluster that offers a set of MapReduce jobs as services to its users. Our results show that admission control is useful both in minimizing losses due to resource overload and in choosing jobs that maximize the service provider's revenue.
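
The classify, select, and retrain loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which job features, utility function, or supervision rule are used, so the categorical features, the `utility` field, and the rule "an admission was correct if no overload followed" are all assumptions made for illustration. The names `NaiveBayesAdmission`, `pick_job`, and `supervise` are hypothetical.

```python
import math
from collections import defaultdict


class NaiveBayesAdmission:
    """Categorical Naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.label_counts = {True: 0, False: 0}
        self.feature_counts = {True: defaultdict(int), False: defaultdict(int)}
        self.feature_values = defaultdict(set)  # feature index -> values seen

    def train(self, features, admissible):
        # Incremental update from one labelled observation.
        self.label_counts[admissible] += 1
        for i, value in enumerate(features):
            self.feature_counts[admissible][(i, value)] += 1
            self.feature_values[i].add(value)

    def _log_score(self, features, label):
        total = sum(self.label_counts.values())
        score = math.log((self.label_counts[label] + self.alpha)
                         / (total + 2 * self.alpha))
        for i, value in enumerate(features):
            n_values = len(self.feature_values[i] | {value})
            num = self.feature_counts[label][(i, value)] + self.alpha
            den = self.label_counts[label] + self.alpha * n_values
            score += math.log(num / den)
        return score

    def is_admissible(self, features):
        # Ties (e.g. an untrained classifier) are resolved in favour of admission.
        return self._log_score(features, True) >= self._log_score(features, False)


def pick_job(classifier, candidates):
    """Among jobs labelled admissible, return the one with the highest utility."""
    admissible = [job for job in candidates
                  if classifier.is_admissible(job["features"])]
    return max(admissible, key=lambda job: job["utility"], default=None)


def supervise(classifier, features, overloaded):
    """Assumed retrospective rule: admitting was correct if no overload followed."""
    classifier.train(features, admissible=not overloaded)


# Hypothetical features: (current cluster load, job input size).
clf = NaiveBayesAdmission()
for _ in range(3):
    supervise(clf, ("low", "small"), overloaded=False)
    supervise(clf, ("high", "large"), overloaded=True)

queue = [
    {"name": "A", "features": ("low", "small"), "utility": 5},
    {"name": "B", "features": ("high", "large"), "utility": 9},
    {"name": "C", "features": ("low", "small"), "utility": 7},
]
chosen = pick_job(clf, queue)
print(chosen["name"])
```

In this toy run, job B has the highest utility but is labelled non-admissible from the training history, so the selection falls to the best of the remaining jobs. A real deployment would need features and a utility model drawn from the actual service contracts.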
