Paper information - Towards Automatic Generation of Semantic Types in Scientific Workflows

Towards Automatic Generation of Semantic Types in Scientific Workflows

Scientific workflow systems are problem-solving environments that allow scientists to automate and reproduce data management and analysis tasks. Workflow components include actors (e.g., queries, transformations, analyses, simulations, visualizations) and datasets, which are produced and consumed by actors. The increasing number of such components creates the problem of discovering suitable components and of composing them to form the desired scientific workflow. In previous work, we proposed the use of semantic types (annotations relative to an ontology) to solve these problems. Since creating semantic types can be complex and time-consuming, scalability of the approach becomes an issue. In this paper, we propose a framework to automatically derive semantic types from a (possibly small) number of initial types. Our approach propagates the given semantic types through workflow steps whose input and output data structures are related via query expressions. By propagating semantic types, we can significantly reduce the effort required to annotate datasets and components and even derive new “candidate axioms” for inclusion in annotation ontologies.
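
As a rough illustration of the propagation idea in the abstract, the following Python sketch assumes the simplest possible case: each workflow step's query expression only projects and renames attributes. The names `Step`, `propagate`, and `propagate_chain`, as well as the attribute and concept labels, are hypothetical and are not taken from the paper; the sketch does not reproduce the paper's framework.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Step:
    """Hypothetical workflow step whose output is a projection/renaming of its input."""
    name: str
    # Maps each output attribute to the input attribute it is copied from.
    output_from_input: Dict[str, str]


def propagate(step: Step, input_types: Dict[str, str]) -> Dict[str, str]:
    """Carry semantic types (attribute -> ontology concept) across one step.

    An output attribute inherits the concept of the input attribute it is
    derived from; attributes with no known type stay unannotated.
    """
    return {
        out_attr: input_types[in_attr]
        for out_attr, in_attr in step.output_from_input.items()
        if in_attr in input_types
    }


def propagate_chain(steps: List[Step], initial_types: Dict[str, str]) -> Dict[str, str]:
    """Propagate a small set of initial annotations through a linear workflow."""
    types = dict(initial_types)
    for step in steps:
        types = propagate(step, types)
    return types


# Assumed example: only the source dataset is annotated by hand.
initial = {"site": "Location", "spp": "Species", "cnt": "Abundance"}
workflow = [
    Step("select_columns", {"site": "site", "spp": "spp"}),
    Step("rename_columns", {"plot": "site", "taxon": "spp"}),
]
derived = propagate_chain(workflow, initial)
print(derived)
```

In this toy setting, the annotations on the final dataset are derived rather than written by hand, which is the effort reduction the abstract refers to. Steps with joins, aggregates, or computed attributes would need rules beyond this attribute-to-attribute mapping.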

Bertram Ludäscher | Shawn Bowers
