Data Quality Assurance for Volunteered Geographic Information

Widely available technology and tools enable the public to collect, contribute, edit, and use geographic information, a domain previously reserved for mapping agencies and companies. Volunteered Geographic Information (VGI) systems such as OpenStreetMap (OSM) therefore depend on both this technology and the participation of individuals. However, this combination also introduces data quality issues: some contributed entities are assigned to wrong or implausible classes, either because contributors interpret the submitted data differently or because they misunderstand the available classes. In this paper we propose two methods for checking the integrity of VGI data with respect to hierarchical consistency and classification plausibility. The methods are based on constraint checking and machine learning. They can be applied to validate data at contribution time, or at a later stage to support collaborative manual or automatic correction.
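To make the two kinds of check concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a hypothetical rule table (`ALLOWED_PARENTS`) for the hierarchical consistency check and substitutes a simple k-nearest-neighbor vote for the unspecified machine learning method. The entity fields (`id`, `cls`, `inside`), the class names, and the single area feature are all illustrative assumptions.

```python
from collections import Counter
from math import dist

# Hypothetical containment rules: child class -> classes allowed to enclose it.
ALLOWED_PARENTS = {
    "playground": {"park", "school", "residential"},
}


def hierarchy_violations(entities):
    """Return ids of entities whose enclosing entity has a disallowed class."""
    by_id = {e["id"]: e for e in entities}
    bad = []
    for e in entities:
        parent = by_id.get(e.get("inside"))  # None when not enclosed
        allowed = ALLOWED_PARENTS.get(e["cls"])  # None when no rule exists
        if parent is not None and allowed is not None:
            if parent["cls"] not in allowed:
                bad.append(e["id"])
    return bad


def knn_plausible_class(features, labelled, k=3):
    """Majority class among the k labelled examples nearest to `features`."""
    nearest = sorted(labelled, key=lambda item: dist(features, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Illustrative data only: a playground inside an industrial area is flagged.
entities = [
    {"id": 1, "cls": "park", "inside": None},
    {"id": 2, "cls": "playground", "inside": 1},
    {"id": 3, "cls": "industrial", "inside": None},
    {"id": 4, "cls": "playground", "inside": 3},
]
flagged = hierarchy_violations(entities)

# Illustrative training set: one feature, the area in square metres (assumed).
labelled = [
    ((90.0,), "garden"), ((100.0,), "garden"), ((120.0,), "garden"),
    ((45000.0,), "park"), ((50000.0,), "park"), ((60000.0,), "park"),
]
suggested = knn_plausible_class((110.0,), labelled)
```

An entity tagged `park` with an area of about 110 square metres would receive the suggestion `garden` here, which could be shown to the contributor as a plausibility warning rather than applied automatically.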
