Integrating Crowdsourced Data with a Land Cover Product: A Bayesian Data Fusion Approach

For many environmental applications, accurate spatial mapping of land cover is a major concern. Land cover products derived from satellite data are expected to offer a fast and inexpensive way of mapping large areas. However, the quality of these products can depend strongly on the area under study. As a result, different products commonly disagree with each other, and assessing their respective quality still relies on ground validation datasets. Recently, crowdsourced data have been suggested as an alternative source of information that might help overcome this problem. However, crowdsourced data are still largely discarded in scientific studies because of their inherently poor quality assurance. The aim of this paper is to present an efficient methodology that allows the user to encode the information brought by crowdsourced data, even if no prior quality estimate is at hand, and possibly to fuse this information with existing land cover products in order to improve their accuracy. It is first suggested that the information brought by volunteers can be coded as a set of inequality constraints on the probabilities of the various land use classes at the visited places. This in turn allows optimal probabilities to be estimated based on a maximum entropy principle, followed by a spatial interpolation of the volunteers’ information. Finally, a Bayesian data fusion approach can be used to fuse multiple volunteers’ contributions with a remotely sensed land cover product. This methodology is illustrated by the mapping of croplands in Ethiopia, where the aim is to improve the cropland map derived from a land cover product with mixed performance. It is shown that crowdsourced information can substantially improve the quality of the final product. The corresponding results also suggest that a prior assessment of remotely sensed data quality can substantially increase the benefit of crowdsourcing campaigns, so that both sources of information need to be accounted for together in order to optimize the sampling effort.
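
The two probabilistic steps named in the abstract, maximum entropy estimation under inequality constraints and Bayesian fusion of several sources, can be illustrated with a minimal sketch. It is not the paper's implementation and rests on stated assumptions: the volunteer's information is reduced to hypothetical lower and upper bounds on each class probability, the spatial interpolation step is omitted, the fusion rule assumes the sources are conditionally independent given the true class, and the function names `maxent_box` and `fuse`, the three classes, and all numbers are invented for illustration.

```python
def maxent_box(lower, upper, tol=1e-12):
    """Maximum-entropy probabilities with lower[i] <= p[i] <= upper[i], sum(p) = 1.

    For box constraints, the optimality conditions give
    p[i] = clip(t, lower[i], upper[i]) for one common level t,
    which is found here by bisection on the sum of the clipped values.
    """
    if sum(lower) > 1.0 or sum(upper) < 1.0:
        raise ValueError("constraints are infeasible")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        t = 0.5 * (lo + hi)
        total = sum(min(max(t, a), b) for a, b in zip(lower, upper))
        if total < 1.0:
            lo = t
        else:
            hi = t
    t = 0.5 * (lo + hi)
    p = [min(max(t, a), b) for a, b in zip(lower, upper)]
    s = sum(p)  # remove the tiny residual left by the bisection tolerance
    return [x / s for x in p]


def fuse(prior, conditionals):
    """Fuse class probabilities from several sources.

    Assumes conditional independence of the sources, so the fused
    probability of class c is proportional to
    prior[c] * product over sources of (cond[c] / prior[c]).
    The prior must be strictly positive.
    """
    weights = []
    for c, p0 in enumerate(prior):
        w = p0
        for cond in conditionals:
            w *= cond[c] / p0
        weights.append(w)
    z = sum(weights)
    return [w / z for w in weights]


# Hypothetical classes: cropland, forest, other.
# Assumed coding of a volunteer's "cropland" answer: p(cropland) >= 0.6.
volunteer = maxent_box(lower=[0.6, 0.0, 0.0], upper=[1.0, 1.0, 1.0])

prior = [0.3, 0.3, 0.4]          # assumed regional class frequencies
land_cover_map = [0.5, 0.3, 0.2]  # assumed probabilities from the map product
posterior = fuse(prior, [volunteer, land_cover_map])
print(volunteer, posterior)
```

In this toy setting the maximum entropy step spreads the probability mass left over after the constraint evenly across the unconstrained classes, and the fusion step strengthens the cropland probability because both sources favour it relative to the prior.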
