Quality-driven geospatial data integration

Accurate and efficient integration of geospatial data is an important problem with applications in areas such as emergency response and urban planning. Two key challenges in supporting large-scale geospatial data integration are automatically computing the quality of the data provided by a large number of geospatial sources, and dynamically providing high-quality answers to user queries based on quality criteria supplied by the user. We describe a framework called the Quality-driven Geospatial Mediator (QGM) that supports efficient and accurate integration of geospatial data from a large number of sources. The key contributions of our framework are: (1) the ability to automatically estimate the quality of data provided by a source by using information from another source of known quality, (2) a representation of the quality of the data provided by the sources in a declarative data integration framework, and (3) a query answering technique that exploits the quality information to provide high-quality geospatial data in response to user queries. Our experimental evaluation using over 1200 real-world sources shows that QGM can accurately estimate the quality of geospatial sources. Moreover, QGM provides better-quality data in response to user queries than traditional data integration systems, and does so with lower response time.
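
Contributions (1) and (3) can be illustrated with a minimal sketch. It is not QGM's actual algorithm, which the abstract does not specify. It assumes that quality is measured as completeness, that a source's completeness can be approximated by the fraction of a reference source's features it also returns for a sample area, and that the user's quality criterion is a single minimum threshold. The names `Source`, `estimate_completeness`, and `select_sources` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A geospatial source, reduced to the feature ids it returns for a sample area."""
    name: str
    features: frozenset


def estimate_completeness(source: Source, reference: Source) -> float:
    """Estimate completeness as the share of reference features the source also has.

    Assumption: the reference source is of known, high quality, so its
    features for the sample area stand in for ground truth.
    """
    if not reference.features:
        raise ValueError("reference source returned no features for the sample")
    overlap = len(source.features & reference.features)
    return overlap / len(reference.features)


def select_sources(quality: dict, min_completeness: float) -> list:
    """Return names of sources meeting the user's threshold, best first."""
    eligible = [name for name, q in quality.items() if q >= min_completeness]
    return sorted(eligible, key=lambda name: quality[name], reverse=True)


# Toy data: feature ids are arbitrary integers.
reference = Source("reference", frozenset({1, 2, 3, 4}))
candidates = [
    Source("a", frozenset({1, 2, 3, 9})),
    Source("b", frozenset({1})),
]

# Step 1: estimate each candidate's quality against the reference.
quality = {s.name: estimate_completeness(s, reference) for s in candidates}

# Step 2: answer a query using only sources that satisfy the user's criterion.
chosen = select_sources(quality, min_completeness=0.5)
print(quality, chosen)
```

A real mediator would track several quality dimensions (for example positional accuracy as well as completeness) and would match features by geometry rather than by shared identifiers; the single-threshold filter here only shows where user-supplied criteria enter query answering.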