GROUND 2 SKY LABEL TRANSFER FOR FINE-GRAINED AERIAL CAR RECOGNITION

Baochen Sun

Overhead images captured by helicopters, unmanned aerial vehicles, and satellites are widely available. Prior aerial target recognition methods mainly deal with generic object categories such as cars, roads, and boats. We go beyond this and aim for fine-grained recognition, e.g., distinguishing between a Toyota sedan and a Honda sedan. This task is so challenging for human annotators that labeling images directly is no longer an option: annotators are often unable to identify the object from such an extreme viewpoint and at such a low resolution. We propose a novel solution for collecting fine-grained annotations of aerial images and develop the first ground-to-sky cross-view car dataset with instance-level correspondences. We compare the performance of human experts and deep learning approaches on fine-grained car recognition from aerial imagery. Noting that intraclass variation in aerial images is limited, we further show that, with simple data augmentation, a classifier can be trained from fewer instances and still achieve performance comparable to, or even significantly better than, that of human experts. Our experimental evidence demonstrates that fine-grained object recognition from overhead images is not only feasible but also well suited to deep learning methods. Our dataset is available at: http://ai.bu.edu/Ground2Sky/
