Crowd-sourced data collection to support automatic classification of building footprint data

Abstract. Human settlements are largely shaped by buildings and their varying characteristics and uses. Despite the importance of buildings for the economy and society, complete regional or even national figures on the entire building stock and its spatial distribution are still hardly available. Existing digital topographic data sets, whether created by National Mapping Agencies or mapped voluntarily through Volunteered Geographic Information (VGI) platforms such as OpenStreetMap, contain building footprint geometries but often lack additional information on building type, usage, age or number of floors. For this reason, predictive modeling is becoming increasingly important in this context. Machine learning makes it possible to predict building types and other building characteristics and thus to classify and describe the entire building stock of cities and regions efficiently. However, such data-driven approaches always require a sufficient amount of ground truth (reference) information for training and validation, and collecting reference data is usually cost-intensive and time-consuming. Experience from other disciplines has shown that crowdsourcing can support the acquisition of ground truth data. This paper therefore presents the results of an experimental study assessing the accuracy of non-expert annotations of street view images collected from an internet crowd. The findings provide the basis for a future integration of a crowdsourcing component into the process of land use mapping, in particular automatic building classification.
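
The workflow outlined in the abstract can be illustrated with a minimal sketch: crowd-collected building-type labels are joined to footprint-derived shape features, a classifier is trained, and its thematic accuracy is assessed on a held-out set. The file names, the feature columns (area, perimeter, compactness, number of floors) and the choice of a random forest are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch (not the paper's actual pipeline): train a building-type
# classifier from footprint-derived features using crowd-collected labels.
# Assumes a CSV of footprint features and a CSV of crowd annotations that
# share a building id; all file and column names are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, cohen_kappa_score

# Footprint features derived from a topographic data set (e.g. OSM footprints).
features = pd.read_csv("footprint_features.csv")  # id, area, perimeter, compactness, n_floors
# Majority-vote building-type labels collected from non-expert crowd workers.
labels = pd.read_csv("crowd_labels.csv")          # id, building_type

data = features.merge(labels, on="id")
X = data[["area", "perimeter", "compactness", "n_floors"]]
y = data["building_type"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)

# Overall accuracy and Cohen's kappa, two common measures of thematic
# classification accuracy.
print("overall accuracy:", accuracy_score(y_test, pred))
print("kappa:", cohen_kappa_score(y_test, pred))
```

In such a setup, the quality of the crowd-supplied reference labels directly bounds the quality of the trained model, which is why the paper's assessment of non-expert annotation accuracy matters for the downstream classification step.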
