Big Data Analytics-Enhanced Cloud Computing: Challenges, Architectural Elements, and Future Directions

The emergence of cloud computing has made dynamic, on-demand provisioning of elastic capacity to applications possible. Cloud data centers contain thousands of physical servers hosting orders of magnitude more virtual machines, which can be allocated on demand to users in a pay-as-you-go model. However, not all systems can scale simply by adding more virtual machines. It is therefore essential, even for scalable systems, to project workloads in advance rather than rely on a purely reactive approach. Modern cloud infrastructures generate real-time monitoring information at scale, in addition to the information produced by operating systems and applications; together, this data poses the issues of volume, velocity, and variety that Big Data approaches address. In this paper, we investigate how Big Data analytics can enhance the operation of cloud computing environments. We discuss diverse applications of Big Data analytics in clouds, open issues for enhancing cloud operations via Big Data analytics, and an architecture for anomaly detection and prevention in clouds, along with future research directions.
