Scientific Workflow Protocol Discovery from Public Event Logs in Clouds

With the advancement of cloud computing, many challenging scientific problems can be solved using scientific workflow technology, which integrates geo-distributed instruments, applications, and big data effectively and efficiently. Workflow collaboration requires the workflow protocols of all participants. However, these protocols are not always available and are often outdated because workflows evolve frequently. To address this problem, we propose a novel workflow discovery approach that extracts up-to-date scientific workflow protocols from public event logs in clouds, without needing access to the full-fledged event logs that involve private events. Our approach achieves this by leveraging transitive precedence relations between events. We implement our approach as a ProM plug-in and evaluate it through extensive experiments on event logs of real-world scientific workflows. The experimental results demonstrate that our approach requires a weaker notion of event log completeness than state-of-the-art approaches do, and that it derives the same workflow protocol from the public event log as the one discovered from the original event log, so private events remain protected.
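
The intuition behind relying on transitive precedence can be illustrated with a small sketch. This is not the ProM plug-in or the discovery algorithm described here; it is a minimal illustration under stated assumptions. The event names, the toy log, and the helper functions `transitive_precedence` and `project` are all hypothetical. The sketch shows that the "occurs somewhere before" relation among public events is the same whether it is computed from the full log or from the log with private events removed, which is not generally true of direct succession.

```python
from itertools import combinations


def transitive_precedence(log):
    """Return every pair (a, b) where a occurs somewhere before b in a trace."""
    pairs = set()
    for trace in log:
        for i, j in combinations(range(len(trace)), 2):
            pairs.add((trace[i], trace[j]))
    return pairs


def project(log, public):
    """Drop all non-public (private) events from every trace."""
    return [[event for event in trace if event in public] for trace in log]


# Hypothetical full event log: "validate" and "audit" are treated as private.
full_log = [
    ["receive", "validate", "compute", "audit", "publish"],
    ["receive", "compute", "validate", "audit", "publish"],
]
public_events = {"receive", "compute", "publish"}

# Relation computed on the full log, then restricted to public events.
from_full = {
    (a, b)
    for a, b in transitive_precedence(full_log)
    if a in public_events and b in public_events
}

# Relation computed directly on the public (projected) log.
from_public = transitive_precedence(project(full_log, public_events))

print(from_full == from_public)
```

In this toy log, "receive" is never directly followed by "compute" in the first trace of the full log, yet the transitive relation between the two survives projection unchanged. That property is what makes a discovery step based on transitive precedence plausible when only public events are visible.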
