A Large Contextual Dataset for Classification, Detection and Counting of Cars with Deep Learning

We have created a large, diverse dataset of cars in overhead imagery, suitable for training a deep learner to perform binary classification, detection, and counting of cars. The dataset and all related material will be made publicly available. The set includes contextual matter to aid in the identification of difficult targets. We demonstrate classification and detection on this dataset using a neural network we call ResCeption, which combines residual learning with Inception-style layers and is used to count cars in one look. This is a new way to count objects, as opposed to counting by localization or density estimation. It is fairly accurate, fast, and easy to implement. Additionally, the counting method is not car- or scene-specific: it would be easy to train it to count other kinds of objects, and counting over new scenes requires no extra setup or assumptions about object locations.
