Resource Provisioning and Scheduling of Big Data Processing Jobs

Cloud Computing has become a buzzword in the IT industry. By providing inexpensive computing resources on a pay-as-you-go basis, it is rapidly gaining momentum as an alternative to the traditional Information Technology (IT) infrastructure of organizations. The increasing use of Clouds therefore makes the execution of Big Data processing jobs a vital research area. As more users store and process their real-time data in Cloud environments, Resource Provisioning and Scheduling of Big Data processing jobs become key considerations for the efficient execution of Big Data applications. This chapter discusses the fundamental concepts underlying Cloud Computing and Big Data and the relationship between them. It will help researchers identify the important characteristics of Cloud Resource Management Systems for handling Big Data processing jobs, and select the most suitable technique for processing Big Data jobs in a Cloud Computing environment.
