AiRound and CV-BrCT: Novel Multiview Datasets for Scene Classification

It is undeniable that aerial/satellite images can provide useful information for a large variety of tasks. But, since these images are always taken from above, some applications can benefit from complementary information provided by other perspective views of the scene, such as ground-level images. Despite a large number of public repositories for both georeferenced photographs and aerial images, there is a lack of benchmark datasets that allow the development of approaches that exploit the benefits and complementarity of aerial/ground imagery. In this article, we present two new publicly available datasets named AiRound and CV-BrCT. The first one contains triplets of images from the same geographic coordinate with different perspectives of view extracted from various places around the world. Each triplet is composed of an aerial RGB image, a ground-level perspective image, and a Sentinel-2 sample. The second dataset contains pairs of aerial and street-level images extracted from southeast Brazil. We design an extensive set of experiments concerning multiview scene classification, using early and late fusion. Such experiments were conducted to show that image classification can be enhanced using multiview data.

[1]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[2]  Joshua B. Tenenbaum,et al.  The Omniglot challenge: a 3-year progress report , 2019, Current Opinion in Behavioral Sciences.

[3]  Scott Workman,et al.  Predicting Ground-Level Scene Layout from Aerial Imagery , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Friedrich Fraundorfer,et al.  Automatic Alignment of Indoor and Outdoor Building Models Using 3D Line Segments , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[5]  Hugo Larochelle,et al.  Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples , 2019, ICLR.

[6]  Hongdong Li,et al.  Lending Orientation to Neural Networks for Cross-View Geo-Localization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Louis-Philippe Morency,et al.  Multimodal Machine Learning: A Survey and Taxonomy , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Michael Dixon,et al.  Google Earth Engine: Planetary-scale geospatial analysis for everyone , 2017 .

[10]  Davide Scaramuzza,et al.  MAV urban localization from Google street view data , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[11]  Wenjun Zeng,et al.  Cross View Fusion for 3D Human Pose Estimation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[12]  Serge J. Belongie,et al.  Learning deep representations for ground-to-aerial geolocalization , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Sébastien Lefèvre,et al.  Coupling ground-level panoramas and aerial imagery for change detection , 2016, Geo spatial Inf. Sci..

[14]  Xiao Xiang Zhu,et al.  Model Fusion for Building Type Classification from Aerial and Street View Images , 2019, Remote. Sens..

[15]  Wei Tu,et al.  Integrating Aerial and Street View Images for Urban Land Use Classification , 2018, Remote. Sens..

[16]  Horst Bischof,et al.  Evaluations on multi-scale camera networks for precise and geo-accurate reconstructions from aerial and terrestrial images with user guidance , 2017, Comput. Vis. Image Underst..

[17]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[18]  Scott Workman,et al.  Wide-Area Image Geolocalization with Aerial Reference Imagery , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[19]  Enhua Wu,et al.  Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Scott Workman,et al.  A Unified Model for Near and Remote Sensing , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[21]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Pietro Perona,et al.  Cataloging Public Objects Using Aerial and Street-Level Images — Urban Trees , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Gim Hee Lee,et al.  CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[26]  Forrest N. Iandola,et al.  SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.

[27]  Devis Tuia,et al.  Understanding urban landuse from the above and ground perspectives: a deep learning, multimodal solution , 2019, Remote Sensing of Environment.

[28]  Sergei Vassilvitskii,et al.  k-means++: the advantages of careful seeding , 2007, SODA '07.

[29]  Jefersson Alex dos Santos,et al.  Towards better exploiting convolutional neural networks for remote sensing scene classification , 2016, Pattern Recognit..

[30]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Mubarak Shah,et al.  Cross-View Image Matching for Geo-Localization in Urban Environments , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Jian Yang,et al.  Selective Kernel Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Remis Balaniuk,et al.  Brazildam: A Benchmark Dataset For Tailings Dam Detection , 2020, 2020 IEEE Latin American GRSS & ISPRS Remote Sensing Conference (LAGIRS).

[34]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.