An efficient multi-resolution SVM network approach for object detection in aerial images

In this paper, we address the problem of object detection in aerial images. Many efficient approaches use a cascade of classifiers that process vectors of descriptive features such as HOG. To account for the variability in object dimensions, features computed at different resolutions are often concatenated into one large descriptor vector. This avoids having to handle the different resolutions explicitly, but it results in the loss of some valuable information. To overcome this problem, we propose a new method based on an SVM network. At the input layer, each resolution is processed independently of the others by a dedicated SVM. The main drawback of such a network is that the computational complexity of the classification phase increases drastically. We therefore propose to favor an incomplete exploration of the network by defining an activation path. This activation path determines the order in which the network neurons are activated, one after the other, and introduces a rejection rule that allows the process to end before the whole network has been crossed. Experimental results are obtained and assessed in an industrial application of urban object detection. Compared with a standard method, we observe an average gain of 17% in precision, while the computational cost is divided by more than 5.
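The idea of an activation path with a rejection rule can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `linear_scorer` and `classify_along_path`, the use of plain linear scorers as stand-ins for the per-resolution SVMs, the summation of scores, and the specific rule (reject as soon as the running sum drops below `reject_below`) are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[Sequence[float]], float]


def linear_scorer(weights: Sequence[float], bias: float) -> Scorer:
    """Return a linear decision function w.x + b (stand-in for a trained SVM)."""
    def score(x: Sequence[float]) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return score


def classify_along_path(
    features_by_resolution: List[Sequence[float]],
    scorers: List[Scorer],
    path: Sequence[int],
    reject_below: float,
) -> Tuple[bool, int]:
    """Activate one scorer per resolution in the order given by `path`.

    Assumed rejection rule: stop as soon as the running sum of scores
    falls below `reject_below`. Returns (accepted, number of scorers evaluated).
    """
    total = 0.0
    evaluated = 0
    for idx in path:
        total += scorers[idx](features_by_resolution[idx])
        evaluated += 1
        if total < reject_below:
            # Early exit: the remaining scorers are never evaluated.
            return False, evaluated
    # The candidate survived the whole path; accept if the sum is positive.
    return total > 0.0, evaluated


# Three hypothetical resolutions, each with a one-dimensional feature.
scorers = [linear_scorer([1.0], 0.0) for _ in range(3)]
rejected = classify_along_path([[-2.0], [5.0], [5.0]], scorers, [0, 1, 2], -1.0)
accepted = classify_along_path([[1.0], [1.0], [1.0]], scorers, [0, 1, 2], -1.0)
```

In this sketch the first candidate is discarded after a single evaluation, which is where the saving in classification cost would come from; the second candidate crosses the whole path before being accepted.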
