Point-of-Interest (POI) Data Validation Methods: An Urban Case Study

Point-of-interest (POI) data from map sources are increasingly used in a wide range of applications, including real estate, land use, and transport planning. However, data quality is uncertain because some of these data are crowdsourced and because proprietary validation workflows lack transparency. Without standardized validation metrics, comparing data quality between POI sources is a challenge. This study reviews and implements the available POI validation methods, working towards identifying a set of metrics that is applicable across datasets. Twenty-three validation methods were found and categorized. Most methods evaluated positional accuracy, while logical consistency and usability were the least represented. A subset of nine methods was implemented to assess four real-world POI datasets extracted for a highly urbanized neighborhood in Singapore. The datasets were found to have poor completeness, with errors of both commission and omission, although spatial errors were reasonably low (<60 m). Thematic accuracy in names and place types varied. The move towards standardized validation metrics depends on factors such as data availability for intrinsic or extrinsic methods, varying levels of detail across POI datasets, the influence of matching procedures, and the intended application of POI data.
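To make the completeness and positional accuracy measures concrete, the following is a minimal Python sketch and not the study's implementation. It assumes that POIs in a test dataset have already been matched to a reference dataset, that commission means test POIs with no reference match, and that omission means reference POIs with no test match. The function names, the dictionary layout, and the sample coordinates are hypothetical.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a sphere of radius 6,371 km."""
    r = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def completeness_errors(test_pois, ref_pois, matches):
    """Return (commission_rate, omission_rate).

    test_pois, ref_pois: dict of POI id -> (lat, lon)   (assumed layout)
    matches: dict of test id -> reference id, produced by some prior matching step
    """
    unmatched_test = set(test_pois) - set(matches)          # present in test only
    unmatched_ref = set(ref_pois) - set(matches.values())   # missing from test
    commission = len(unmatched_test) / len(test_pois)
    omission = len(unmatched_ref) / len(ref_pois)
    return commission, omission

def mean_positional_error_m(test_pois, ref_pois, matches):
    """Mean distance in metres between matched POI pairs."""
    dists = [haversine_m(*test_pois[t], *ref_pois[r]) for t, r in matches.items()]
    return sum(dists) / len(dists)

# Hypothetical sample data: three test POIs, two reference POIs, one match.
test_pois = {"a": (1.3000, 103.8000), "b": (1.3010, 103.8010), "c": (1.3020, 103.8020)}
ref_pois = {"x": (1.3000, 103.8000), "y": (1.3050, 103.8050)}
matches = {"a": "x"}

commission, omission = completeness_errors(test_pois, ref_pois, matches)
error_m = mean_positional_error_m(test_pois, ref_pois, matches)
```

Both rates depend entirely on the matching step that produces `matches`, which is one reason the abstract lists the influence of matching procedures among the factors affecting standardization.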
