Assisting Scientists with Complex Data Analysis Tasks through Semantic Workflows

To assist scientists in data analysis tasks, we have developed semantic workflow representations that support automatic constraint propagation and reasoning algorithms for managing constraints among the individual workflow steps. Semantic constraints can represent requirements on input datasets as well as best practices for the method that a workflow implements. We demonstrate how the Wings workflow system uses semantic workflows to assist users in creating workflows while validating that those workflows comply with the requirements of the software components and datasets. Wings reasons over semantic workflow representations that consist of both a traditional dataflow graph and a network of constraints on the data and components of the workflow.
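The two-part representation described above, a dataflow graph plus constraints on data and components, can be illustrated with a small sketch. This is not the Wings implementation or its API: the `Component` and `Step` classes, the `validate` function, and the metadata properties (`format`, `normalized`) are all hypothetical names chosen for illustration, and constraints are reduced to simple property equality checks propagated forward along the dataflow.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical software component with constraints on its ports."""
    name: str
    requires: dict  # input port -> metadata properties the input must have
    produces: dict  # output port -> metadata properties asserted on the output


@dataclass
class Step:
    """One node of the dataflow graph: a component bound to dataset variables."""
    component: Component
    inputs: dict   # input port -> dataset variable name
    outputs: dict  # output port -> dataset variable name


def validate(steps, known):
    """Propagate metadata forward through steps given in dataflow order.

    `known` maps dataset variable names to their metadata. Returns a list of
    violations as (component, variable, property, required, actual) tuples.
    """
    metadata = {var: dict(props) for var, props in known.items()}
    violations = []
    for step in steps:
        # Check each input dataset against the component's requirements.
        for port, var in step.inputs.items():
            have = metadata.get(var, {})
            for prop, wanted in step.component.requires.get(port, {}).items():
                if have.get(prop) != wanted:
                    violations.append(
                        (step.component.name, var, prop, wanted, have.get(prop))
                    )
        # Propagate: outputs carry the metadata the component asserts.
        for port, var in step.outputs.items():
            metadata[var] = dict(step.component.produces.get(port, {}))
    return violations


# Assumed example: clustering requires data that has been normalized first.
normalize = Component(
    "normalize",
    requires={"in": {"format": "csv"}},
    produces={"out": {"format": "csv", "normalized": True}},
)
cluster = Component(
    "cluster",
    requires={"in": {"normalized": True}},
    produces={"out": {"format": "clusters"}},
)
steps = [
    Step(normalize, inputs={"in": "raw"}, outputs={"out": "scaled"}),
    Step(cluster, inputs={"in": "scaled"}, outputs={"out": "groups"}),
]

# The full workflow satisfies every constraint.
print(validate(steps, {"raw": {"format": "csv"}}))
# Skipping normalization violates the clustering component's requirement.
print(validate(steps[1:], {"scaled": {"format": "csv"}}))
```

In this sketch a best practice (normalize before clustering) is encoded as a requirement on the clustering input, so a workflow that omits the normalization step is flagged before it is executed.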
