Exploring semantic elements for urban scene recognition: Deep integration of high-resolution imagery and OpenStreetMap (OSM)

Abstract Urban scenes are city blocks, the basic units of megacities, and they play an important role in citizens' welfare and city management. Remote sensing imagery, with its large-scale coverage and accurate target descriptions, has been regarded as an ideal solution for monitoring the urban environment. However, because remote sensing images are heterogeneous, it is difficult to access their geographical content at the object level, let alone to understand urban scenes at the block level. Recently, deep learning-based strategies have been applied to interpret urban scenes with remarkable accuracy. However, deep neural networks require a substantial number of training samples, a requirement that is hard to satisfy, especially for high-resolution images. Meanwhile, crowdsourced OpenStreetMap (OSM) data provide rich annotation information about urban targets but may suffer from insufficient sampling (limited to the places where people can go). As a result, combining OSM and remote sensing images for efficient urban scene recognition is urgently needed. In this paper, we present a novel strategy to transfer existing OSM data to high-resolution images for semantic element determination and urban scene understanding. Specifically, an object-based convolutional neural network (OCNN) is used for geographical object detection by feeding it rich semantic elements derived from OSM data. Then, the geographical objects are assigned functional labels by integrating points of interest (POIs), which carry rich semantic terms such as commercial or educational labels. Lastly, the categories of urban scenes are derived from the semantic objects they contain. Experimental results indicate that the proposed method is able to classify complex urban scenes. On the Beijing dataset, classification accuracy reaches 91% at the object level and 88% at the scene level. Additionally, we are probably the first to investigate object-level semantic mapping by combining high-resolution images and OSM data of urban areas. Consequently, the presented method is effective in delineating urban scenes and could further support urban environment monitoring and planning with high-resolution images.
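
The last two steps of the pipeline (attaching POI semantics to detected objects, then deriving a scene category from the objects inside a block) can be illustrated with a small sketch. The abstract does not state the actual assignment or aggregation rules, so everything below is an assumption made for illustration: objects are simplified to bounding boxes, each object takes the most frequent category among the POIs falling inside it, and the scene takes the label covering the largest total object area. The function names `label_objects` and `label_scene` are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def label_objects(objects, pois):
    """Assign a functional label to each detected object.

    objects: dict mapping object id -> (xmin, ymin, xmax, ymax, land_cover)
    pois:    list of (x, y, category) tuples

    Assumed rule (not from the paper): an object takes the most frequent
    category among the POIs inside its bounding box, and falls back to its
    land-cover class when it contains no POI.
    """
    labels = {}
    for obj_id, (xmin, ymin, xmax, ymax, cover) in objects.items():
        inside = [cat for x, y, cat in pois
                  if xmin <= x <= xmax and ymin <= y <= ymax]
        labels[obj_id] = Counter(inside).most_common(1)[0][0] if inside else cover
    return labels


def label_scene(objects, labels):
    """Pick the scene category as the label with the largest total area.

    Area-weighted voting is an assumed aggregation rule for illustration.
    """
    area = defaultdict(float)
    for obj_id, (xmin, ymin, xmax, ymax, _) in objects.items():
        area[labels[obj_id]] += (xmax - xmin) * (ymax - ymin)
    return max(area, key=area.get)


# Toy block with three objects and three POIs (made-up coordinates).
objects = {
    "a": (0, 0, 10, 10, "building"),
    "b": (10, 0, 14, 10, "building"),
    "c": (0, 10, 14, 12, "vegetation"),
}
pois = [(2, 3, "commercial"), (5, 5, "commercial"), (12, 4, "educational")]

object_labels = label_objects(objects, pois)
scene_label = label_scene(objects, object_labels)
print(object_labels, scene_label)
```

In this toy block, the object without any POI keeps its land-cover class, and the scene label follows the functional label with the largest footprint. A real implementation would use the segmented object polygons rather than bounding boxes.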
