Workflow Analysis using Graph Kernels

Workflow enactment systems are a popular technology in business and e-science alike for flexibly defining and enacting complex data processing tasks. Since constructing a workflow for a specific task can become quite complex, efforts are currently underway to increase the re-use of workflows through specialized workflow repositories. While existing methods for exploiting the knowledge in these repositories usually treat workflows as atomic entities, our work is based on the fact that workflows can naturally be viewed as graphs. Hence, in this paper we investigate the use of graph kernels for the problems of workflow discovery, workflow recommendation, and workflow pattern extraction, paying special attention to the typical situation of few labeled and many unlabeled workflows. To empirically demonstrate the feasibility of our approach, we investigate a dataset of bioinformatics workflows retrieved from the website myexperiment.org.
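To make the idea of comparing workflows as graphs concrete, the following is a minimal sketch of one very simple graph kernel. It is an illustrative assumption, not necessarily the kernel used in the paper: each workflow is represented as a directed graph whose nodes carry a label (for example, a service name), and the kernel is the dot product of counts of node labels and labeled edges. The function names `features` and `kernel` and the example workflows `wf_a` and `wf_b` are hypothetical.

```python
from collections import Counter


def features(nodes, edges):
    """Count node labels and labeled edges of one workflow graph.

    nodes: dict mapping node id -> label (hypothetical service names)
    edges: iterable of (source id, target id) pairs
    """
    counts = Counter()
    for label in nodes.values():
        counts[("node", label)] += 1
    for src, dst in edges:
        counts[("edge", nodes[src], nodes[dst])] += 1
    return counts


def kernel(g1, g2):
    """Dot product of the two feature count vectors.

    Being an explicit inner product, this is a valid
    (positive semi-definite) kernel.
    """
    f1, f2 = features(*g1), features(*g2)
    return sum(count * f2[key] for key, count in f1.items())


# Two made-up workflows that share their first two steps.
wf_a = ({1: "fetch_sequence", 2: "blast", 3: "render"}, [(1, 2), (2, 3)])
wf_b = ({1: "fetch_sequence", 2: "blast", 3: "align"}, [(1, 2), (2, 3)])

print(kernel(wf_a, wf_b))  # shared labels and edges contribute to the score
print(kernel(wf_a, wf_a))  # self-similarity
```

A kernel of this form can be plugged into any kernel method, such as a support vector machine or a kernel-based clustering algorithm, which is what makes the graph view usable for discovery and recommendation tasks.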
