Towards provenance-aware geographic information systems

Geographic information systems (GIS) play an important role in acquiring and communicating geospatial knowledge based on spatial data and the use of spatial analysis, modeling, and visualization. Assuring the validity and quality of spatial data handling and analysis remains a great challenge, in part because sophisticated procedures are often required for collaborative geospatial problem-solving and decision-making. These procedures, when specified as knowledge derivation workflows, require carefully configured parameters and spatiotemporal specifications guided by specific contexts and purposes. In this research, spatial provenance is defined as information about spatial data lineage and the related analysis workflow. Such information is often not well recorded or managed during spatial data handling and related analysis. This paper presents a provenance-aware GIS architecture that incorporates spatial provenance to address this shortcoming and to help assure the validity and quality of spatial data handling and analysis. Spatial provenance in this architecture is generated and managed so that data lineage and workflow information can be queried in support of geospatial problem-solving. The basic elements of spatial provenance are captured using a spatial provenance model. An illustration of the provenance-aware GIS architecture and its proof-of-concept implementation reveals the similarities and differences in the use of spatial provenance across GIS applications. Overall, the architecture and implementation described in the paper demonstrate the necessity and feasibility of introducing provenance into GIS.
