Model provenance tracking and inference for integrated environmental modelling

Integrated environmental modelling (IEM) provides a systematic way to couple models for integrated analysis. Coupled models in IEM often exchange data at runtime during time-step-based execution, which makes it difficult to track which raw observations or intermediate data contributed to an individual model output. Time-step-level provenance is needed to audit model execution and to diagnose anomalies. This paper introduces a method to support provenance awareness in IEM. It proposes that individual models expose the interfaces needed for provenance capture in IEM environments. Provenance is represented using the W3C PROV model for interoperability. Fine-grained provenance is inferred from coarse-grained provenance and the temporal characteristics of the computations in numerical time-marching models. The approach is implemented in OpenMI-compliant models. A case study of model provenance tracking and inference in a watershed runoff simulation scenario illustrates the applicability of the approach.

Highlights: Model provenance tracking in integrated environmental modelling. Inference for fine-grained model provenance. Provenance awareness for OpenMI.
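The inference idea in the abstract, deriving time-step-level provenance from model-level links plus the temporal behaviour of a time-marching computation, can be illustrated with a minimal sketch. This is not the paper's implementation: the `CoarseLink` class, the `lag_steps` parameter, the model names `runoff` and `rainfall`, and the identifier format are all hypothetical. Only the relation name `prov:wasDerivedFrom` comes from the W3C PROV vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoarseLink:
    """Coarse-grained provenance: consumer's output series depends on producer's."""
    consumer: str
    producer: str
    lag_steps: int = 0  # hypothetical: how many time steps back the consumer reads


def infer_fine_grained(links, n_steps):
    """Expand model-level links into per-time-step PROV-style statements."""
    statements = []
    for link in links:
        for t in range(n_steps):
            src_t = t - link.lag_steps
            if src_t < 0:
                continue  # no producer value exists before the first step
            statements.append((
                f"{link.consumer}:out[t={t}]",
                "prov:wasDerivedFrom",
                f"{link.producer}:out[t={src_t}]",
            ))
    return statements


# Assumed example: a runoff model reads rainfall at the same step and
# its own state from the previous step (time marching).
links = [
    CoarseLink("runoff", "rainfall"),
    CoarseLink("runoff", "runoff", lag_steps=1),
]
for statement in infer_fine_grained(links, 3):
    print(statement)
```

Storing only the coarse links and expanding them on demand trades a small amount of computation for much lower provenance storage, which is the general motivation for inferring fine-grained provenance rather than recording it at every step.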
