Measuring data-centre workflows complexity through process mining: the Google cluster case

Data centres have become the backbone of large Cloud services and applications, providing virtually unlimited, elastic and scalable computational and storage resources. The efficient use and optimisation of these resources is a key concern for large Cloud Service Providers, and it is becoming increasingly challenging as new computing paradigms such as the Internet of Things, Cyber-Physical Systems and Edge Computing spread. Achieving efficiency in data centres depends on discovering and properly analysing data-centre behaviour. In this paper, we present a model that automatically retrieves execution workflows from existing data-centre logs by employing process mining techniques. The discovered processes are characterised and analysed according to their understandability and complexity, in terms of the execution efficiency of data-centre jobs. Finally, we validate the proposal and demonstrate its usability by applying the model to a real scenario: the Google Cluster traces.
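To make the idea of retrieving workflows from logs more concrete, the following is a minimal sketch of one common process-mining building block, the directly-follows graph, computed over a toy event log. It is not the model proposed in the paper: the record layout `(job_id, timestamp, event_type)`, the sample events, and the function names `directly_follows` and `density` are all assumptions made for illustration, and the density measure is only one simple proxy for workflow complexity.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count how often one activity directly follows another within a case.

    `events` is an iterable of (case_id, timestamp, activity) tuples.
    Here a case is assumed to be a data-centre job (hypothetical layout).
    """
    traces = defaultdict(list)
    for case_id, timestamp, activity in events:
        traces[case_id].append((timestamp, activity))

    dfg = Counter()
    for steps in traces.values():
        steps.sort(key=lambda step: step[0])  # order each trace by time
        for (_, a), (_, b) in zip(steps, steps[1:]):
            dfg[(a, b)] += 1
    return dfg


def density(dfg):
    """Distinct arcs divided by the number of possible arcs (self-loops allowed).

    A higher value suggests a more tangled, less understandable model.
    """
    nodes = {activity for arc in dfg for activity in arc}
    return len(dfg) / (len(nodes) ** 2) if nodes else 0.0


# Toy log (invented data): three jobs with scheduler-style events.
log = [
    (1, 0, "SUBMIT"), (1, 1, "SCHEDULE"), (1, 5, "FINISH"),
    (2, 0, "SUBMIT"), (2, 2, "SCHEDULE"), (2, 3, "EVICT"),
    (2, 4, "SUBMIT"), (2, 6, "SCHEDULE"), (2, 9, "FINISH"),
    (3, 1, "SUBMIT"), (3, 2, "SCHEDULE"), (3, 4, "FAIL"),
]

dfg = directly_follows(log)
for (a, b), count in sorted(dfg.items()):
    print(f"{a} -> {b}: {count}")
print(f"density = {density(dfg):.2f}")
```

In this toy log, the evicted job loops back to resubmission, which adds arcs to the graph; a structural measure such as density grows with that kind of rework, which is the intuition behind relating model complexity to the execution efficiency of jobs.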
