Workflow matching using semantic metadata

Workflows are becoming an increasingly common paradigm for managing scientific analyses. As workflow repositories emerge, workflow retrieval and discovery become a challenge. Studies have shown that scientists wish to discover workflows based on properties of workflow data inputs, intermediate data products, and data results. However, workflows typically lack this information when they are contributed to a repository. Our work addresses this issue by augmenting workflow descriptions with constraints derived from properties of the workflow components used to process data, as well as of the data itself. An important feature of our approach is that it assumes that component and data properties are obtained from catalogs that are external to the workflow system, consistent with current architectures for computational science.
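The idea of augmenting a workflow description with properties drawn from external catalogs can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the representation used in this work: the catalog contents, the `Workflow` class, and the `augment` and `matches` helpers are hypothetical, and plain dictionaries with exact-match comparison stand in for semantic metadata and constraint reasoning.

```python
from dataclasses import dataclass, field

# Hypothetical external catalogs (illustrative contents only).
DATA_CATALOG = {
    "raw_reads.csv": {"format": "csv", "domain": "genomics"},
}
COMPONENT_CATALOG = {
    "normalize": {
        "requires": {"format": "csv"},
        "produces": {"format": "csv", "normalized": True},
    },
    "cluster": {
        "requires": {"normalized": True},
        "produces": {"format": "json", "kind": "clusters"},
    },
}


@dataclass
class Workflow:
    name: str
    input_data: str   # key into DATA_CATALOG
    steps: list       # ordered keys into COMPONENT_CATALOG
    annotations: dict = field(default_factory=dict)


def augment(workflow):
    """Attach input, intermediate, and output properties to a workflow."""
    props = dict(DATA_CATALOG[workflow.input_data])
    workflow.annotations["input"] = dict(props)
    produced = []
    for step in workflow.steps:
        spec = COMPONENT_CATALOG[step]
        # Check the component's constraints against the incoming data.
        for key, value in spec["requires"].items():
            if props.get(key) != value:
                raise ValueError(f"{step} requires {key}={value!r}")
        # Properties the component does not overwrite carry forward.
        props = {**props, **spec["produces"]}
        produced.append(dict(props))
    workflow.annotations["intermediate"] = produced[:-1]
    workflow.annotations["output"] = produced[-1] if produced else dict(props)
    return workflow


def matches(workflow, role, query):
    """True if some data product in the given role has all queried properties."""
    target = workflow.annotations[role]
    candidates = target if isinstance(target, list) else [target]
    return any(all(c.get(k) == v for k, v in query.items()) for c in candidates)


wf = augment(Workflow("demo", "raw_reads.csv", ["normalize", "cluster"]))
print(matches(wf, "output", {"kind": "clusters"}))
print(matches(wf, "intermediate", {"normalized": True}))
```

Once annotated this way, a repository could answer queries over inputs, intermediate products, or results without the contributor having supplied that information by hand.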
