Quality of GNSS Traces from VGI: A Data Cleaning Method Based on Activity Type and User Experience

VGI (Volunteered Geographic Information) refers to spatial data collected, created, and shared voluntarily by users. Georeferenced tracks are one of the most common components of VGI and, like other VGI, are not free from errors. The cleaning of GNSS (Global Navigation Satellite System) tracks is usually based on the detection and removal of outliers using their geometric characteristics. However, in our experience, differentiating between user profiles is still a novelty, and studies examining the relationship between contributor efficiency, activity, and the quality of the VGI produced are lacking. The aim of this study is to design a procedure to filter GNSS traces according to their quality, the type of activity pursued, and the contributor's efficiency in collecting VGI. Source data are obtained from Wikiloc. The methodology includes track classification according to mobility type, box plot analysis to identify outliers, bivariate user segmentation according to level of activity and efficiency, and the study of the users' spatial behavior using kernel-density maps. The results reveal that out of 44,326 tracks, 8,096 (18.26%) are considered erroneous, mainly (73.02%) because of contributors' poor practices, with the remainder attributable to poor GNSS reception. The results also show a positive correlation between data quality and the author's efficiency in collecting VGI.
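The box plot step can be illustrated with a minimal sketch. It assumes the common Tukey rule (values beyond 1.5 times the interquartile range from the quartiles are outliers) applied separately to each mobility type. The abstract does not state which track variable or fence multiplier was used, so the `speed_kmh` and `activity` fields, the sample values, and the default `k=1.5` are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def quartiles(values):
    """Return (Q1, Q3) using linear interpolation between sorted values."""
    data = sorted(values)

    def quantile(p):
        pos = p * (len(data) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(data) - 1)
        return data[lo] + (data[hi] - data[lo]) * (pos - lo)

    return quantile(0.25), quantile(0.75)


def flag_outliers(tracks, k=1.5):
    """Return ids of tracks whose speed lies outside the box plot fences
    computed for their own activity type (assumed rule: Q1 - k*IQR, Q3 + k*IQR)."""
    by_activity = defaultdict(list)
    for track in tracks:
        by_activity[track["activity"]].append(track["speed_kmh"])

    fences = {}
    for activity, speeds in by_activity.items():
        q1, q3 = quartiles(speeds)
        iqr = q3 - q1
        fences[activity] = (q1 - k * iqr, q3 + k * iqr)

    flagged = []
    for track in tracks:
        low, high = fences[track["activity"]]
        if not low <= track["speed_kmh"] <= high:
            flagged.append(track["id"])
    return flagged


# Hypothetical sample: track 5 is tagged as hiking but moves at driving speed.
tracks = [
    {"id": 1, "activity": "hiking", "speed_kmh": 3.8},
    {"id": 2, "activity": "hiking", "speed_kmh": 4.2},
    {"id": 3, "activity": "hiking", "speed_kmh": 4.5},
    {"id": 4, "activity": "hiking", "speed_kmh": 5.0},
    {"id": 5, "activity": "hiking", "speed_kmh": 38.0},
    {"id": 6, "activity": "cycling", "speed_kmh": 14.0},
    {"id": 7, "activity": "cycling", "speed_kmh": 16.5},
    {"id": 8, "activity": "cycling", "speed_kmh": 18.0},
    {"id": 9, "activity": "cycling", "speed_kmh": 19.5},
    {"id": 10, "activity": "cycling", "speed_kmh": 21.0},
]

print(flag_outliers(tracks))  # prints [5]
```

Computing the fences per activity rather than over the whole dataset matters here: a speed of 21 km/h is unremarkable for cycling but would be flagged if it were pooled with the hiking tracks.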
