Cost-Effective Class-Imbalance Aware CNN for Vehicle Localization and Categorization in High Resolution Aerial Images

Joint vehicle localization and categorization in high resolution aerial images can provide useful information for applications such as traffic flow structure analysis. To maintain sufficient features to recognize small-scaled vehicles, a regions with convolutional neural network features (R-CNN) -like detection structure is employed. In this setting, cascaded localization error can be averted by equally treating the negatives and differently typed positives as a multi-class classification task, but the problem of class-imbalance remains. To address this issue, a cost-effective network extension scheme is proposed. In it, the correlated convolution and connection costs during extension are reduced by feature map selection and bi-partite main-side network construction, which are realized with the assistance of a novel feature map class-importance measurement and a new class-imbalance sensitive main-side loss function. By using an image classification dataset established from a set of traditional real-colored aerial images with 0.13 m ground sampling distance which are taken from the height of 1000 m by an imaging system composed of non-metric cameras, the effectiveness of the proposed network extension is verified by comparing with its similarly shaped strong counter-parts. Experiments show an equivalent or better performance, while requiring the least parameter and memory overheads are required.

[1]  Pascal Vincent,et al.  Visualizing Higher-Layer Features of a Deep Network , 2009 .

[2]  Masoud Nikravesh,et al.  Feature Extraction - Foundations and Applications , 2006, Feature Extraction.

[3]  Taeho Jo,et al.  Class imbalances versus small disjuncts , 2004, SKDD.

[4]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Jian Sun,et al.  Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Xiaogang Wang,et al.  Visual Tracking with Fully Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[7]  Xinkai Wu,et al.  A Hybrid Vehicle Detection Method Based on Viola-Jones and HOG + SVM from UAV Images , 2016, Sensors.

[8]  Andrew Zisserman,et al.  Interactive Object Counting , 2014, ECCV.

[9]  Witold Pedrycz,et al.  Dual autoencoders features for imbalance classification problem , 2016, Pattern Recognit..

[10]  Luís Torgo,et al.  A Survey of Predictive Modelling under Imbalanced Distributions , 2015, ArXiv.

[11]  Chen Huang,et al.  Learning Deep Representation for Imbalanced Classification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Jianbo Shi,et al.  DeepEdge: A multi-scale bifurcated deep network for top-down contour detection , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Shiming Xiang,et al.  Vehicle Detection in Satellite Images by Hybrid Deep Convolutional Neural Networks , 2014, IEEE Geoscience and Remote Sensing Letters.

[15]  Haibo He,et al.  ADASYN: Adaptive synthetic sampling approach for imbalanced learning , 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).

[16]  Mohammed Bennamoun,et al.  Cost-Sensitive Learning of Deep Feature Representations From Imbalanced Data , 2015, IEEE Transactions on Neural Networks and Learning Systems.

[17]  P. Gong,et al.  Object-based Detection and Classification of Vehicles from High-resolution Aerial Photography , 2009 .

[18]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Antoni B. Chan,et al.  Small instance detection by integer programming on object density maps , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Tong Zhang,et al.  Deep Learning Based Feature Selection for Remote Sensing Scene Classification , 2015, IEEE Geoscience and Remote Sensing Letters.

[21]  Kai Ming Ting,et al.  A Comparative Study of Cost-Sensitive Boosting Algorithms , 2000, ICML.

[22]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[23]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[24]  Ding Wang,et al.  Feature selection and feature learning for high-dimensional batch reinforcement learning: A survey , 2015, International Journal of Automation and Computing.

[25]  Diogo Almeida,et al.  Resnet in Resnet: Generalizing Residual Architectures , 2016, ArXiv.

[26]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27]  Yu Qiao,et al.  A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.

[28]  Kavita Bala,et al.  Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Lin Lei,et al.  Vehicle Detection in Aerial Images Based on Region Convolutional Neural Networks and Hard Negative Example Mining , 2017, Sensors.

[30]  Andrew Zisserman,et al.  Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps , 2013, ICLR.

[31]  Yann LeCun,et al.  Optimal Brain Damage , 1989, NIPS.

[32]  Gaofeng Meng,et al.  Vehicle Detection in Satellite Images by Incorporating Objectness and Convolutional Neural Network , 2016 .

[33]  Abhinav Gupta,et al.  Transferring Rich Feature Hierarchies for Robust Visual Tracking , 2015, ArXiv.

[34]  Joan Bruna,et al.  Intriguing properties of neural networks , 2013, ICLR.

[35]  Robinson Piramuthu,et al.  HD-CNN: Hierarchical Deep Convolutional Neural Networks for Large Scale Visual Recognition , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[36]  Liujuan Cao,et al.  Robust vehicle detection by combining deep features with exemplar classification , 2016, Neurocomputing.

[37]  Lance Chun Che Fung,et al.  Classification of Imbalanced Data by Combining the Complementary Neural Network and SMOTE Algorithm , 2010, ICONIP.

[38]  Andrew J. R. Simpson Over-Sampling in a Deep Neural Network , 2015, ArXiv.

[39]  Qi Wang,et al.  Salient Band Selection for Hyperspectral Image Classification via Manifold Ranking , 2016, IEEE Transactions on Neural Networks and Learning Systems.

[40]  Yuan Yuan,et al.  Congested scene classification via efficient unsupervised feature learning and density estimation , 2016, Pattern Recognit..

[41]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[42]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[43]  Gong Cheng,et al.  RIFD-CNN: Rotation-Invariant and Fisher Discriminative Convolutional Neural Networks for Object Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Adam Krzyzak,et al.  Analysis of Feature Maps Selection in Supervised Learning Using Convolutional Neural Networks , 2014, Canadian Conference on AI.

[45]  Bo Du,et al.  Saliency-Guided Unsupervised Feature Learning for Scene Classification , 2015, IEEE Transactions on Geoscience and Remote Sensing.

[46]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Frédéric Jurie,et al.  Vehicle detection in aerial imagery : A small target detection benchmark , 2016, J. Vis. Commun. Image Represent..

[48]  Bin Xu,et al.  Vehicle Detection in Aerial Images Using Modified YOLO , 2019, 2019 IEEE 19th International Conference on Communication Technology (ICCT).

[49]  Hod Lipson,et al.  Understanding Neural Networks Through Deep Visualization , 2015, ArXiv.

[50]  Gellért Máttyus,et al.  Fast Multiclass Vehicle Detection on Aerial Images , 2015, IEEE Geoscience and Remote Sensing Letters.

[51]  Hayit Greenspan,et al.  Chest pathology identification using deep feature selection with non-medical training , 2018, Comput. methods Biomech. Biomed. Eng. Imaging Vis..

[52]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[53]  Gerald Schaefer,et al.  Cost-sensitive decision tree ensembles for effective imbalanced classification , 2014, Appl. Soft Comput..

[54]  Shilei Sun,et al.  Vehicle detection from high-resolution aerial images using spatial pyramid pooling-based deep convolutional neural networks , 2016, Multimedia Tools and Applications.

[55]  Zhi-Hua Zhou,et al.  ON MULTI‐CLASS COST‐SENSITIVE LEARNING , 2006, Comput. Intell..

[56]  Andrew Zisserman,et al.  Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[57]  Bin Yang,et al.  Convolutional Channel Features , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[58]  Qixiang Ye,et al.  Orientation robust object detection in aerial images using deep convolutional neural network , 2015, 2015 IEEE International Conference on Image Processing (ICIP).

[59]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[60]  Pengfei Wang,et al.  Jointly Feature Learning and Selection for Robust Tracking via a Gating Mechanism , 2016, PloS one.

[61]  Haibo He,et al.  Learning from Imbalanced Data , 2009, IEEE Transactions on Knowledge and Data Engineering.

[62]  Uwe Stilla,et al.  Airborne Vehicle Detection in Dense Urban Areas Using HoG Features and Disparity Maps , 2013, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[63]  Marius Leordeanu,et al.  Dual Local-Global Contextual Pathways for Recognition in Aerial Imagery , 2016, ArXiv.

[64]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[65]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[66]  Masakazu Matsugu,et al.  Unsupervised Feature Selection for Multi-class Object Detection Using Convolutional Neural Networks , 2004, ISNN.

[67]  Yan Wang,et al.  DeepContour: A deep convolutional feature learned by positive-sharing loss for contour detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[68]  Yunqian Ma,et al.  Imbalanced Learning: Foundations, Algorithms, and Applications , 2013 .

[69]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[70]  Bowen Zhou,et al.  Classifying Relations by Ranking with Convolutional Neural Networks , 2015, ACL.

[71]  Ming Shao,et al.  Families in the Wild (FIW): Large-Scale Kinship Image Database and Benchmarks , 2016, ACM Multimedia.

[72]  Andrew Zisserman,et al.  Spatial Transformer Networks , 2015, NIPS.

[73]  Xiang Yu,et al.  Deep Metric Learning via Lifted Structured Feature Embedding , 2016 .

[74]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[75]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[76]  Jason Yosinski,et al.  Deep neural networks are easily fooled: High confidence predictions for unrecognizable images , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[77]  Bartosz Krawczyk,et al.  Learning from imbalanced data: open challenges and future directions , 2016, Progress in Artificial Intelligence.

[78]  J. Reitberger,et al.  Automatic car detection in high resolution urban scenes based on an adaptive 3D-model , 2003, 2003 2nd GRSS/ISPRS Joint Workshop on Remote Sensing and Data Fusion over Urban Areas.

[79]  Yann LeCun,et al.  What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.