Speeding up a convolutional neural network by connecting an SVM network

Deep neural networks achieve good object detection results in aerial imaging, but at a heavy computational cost. To reduce this cost, we propose connecting an SVM network to the different feature maps of a CNN. Once this SVM network is trained, we use an activation path to traverse the network in a predefined order, and we stop the traversal as early as possible. This early exit from the CNN reduces the computational burden. Experimental results are obtained for an industrial application in urban object detection. We show that the computational cost could potentially be reduced by 98%. Additionally, performance is slightly improved; for example, at 55% recall, precision increases by 5%.
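
The early-exit idea can be illustrated with a minimal sketch. It is not the authors' implementation: the exit rule (stop when a stage's linear decision score falls below a threshold), the function names, and the toy stages and weights are all assumptions made for illustration. Each stage stands in for one block of CNN layers, and each (w, b) pair stands in for a linear SVM attached to that stage's feature map.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Stage = Callable[[Vector], Vector]          # stand-in for one block of CNN layers
LinearSVM = Tuple[Sequence[float], float]   # (weights, bias) of a linear decision function


def linear_score(svm: LinearSVM, features: Vector) -> float:
    """Linear decision value w . x + b."""
    w, b = svm
    return sum(wi * fi for wi, fi in zip(w, features)) + b


def early_exit(stages: List[Stage], svms: List[LinearSVM], x: Vector,
               reject_below: float = 0.0) -> Tuple[bool, int]:
    """Traverse the stages in the given order (the assumed activation path).

    After each stage, the attached linear classifier scores the features.
    A score under `reject_below` (assumed exit rule) stops the traversal.
    Returns (accepted, number_of_stages_computed).
    """
    features = x
    for depth, (stage, svm) in enumerate(zip(stages, svms), start=1):
        features = stage(features)
        if linear_score(svm, features) < reject_below:
            return False, depth  # early exit: deeper stages are never computed
    return True, len(stages)


# Toy stand-ins for feature extraction, chosen only to make the sketch runnable.
toy_stages: List[Stage] = [
    lambda v: [max(0.0, vi) for vi in v],   # ReLU-like stage
    lambda v: [2.0 * vi for vi in v],       # scaling stage
]
toy_svms: List[LinearSVM] = [
    ([1.0, 1.0, 1.0], -1.0),
    ([1.0, 1.0, 1.0], -10.0),
]

rejected = early_exit(toy_stages, toy_svms, [-1.0, -2.0, -3.0])
accepted = early_exit(toy_stages, toy_svms, [1.0, 2.0, 3.0])
```

In this toy setting, a window rejected at the first stage costs one stage of computation instead of two. Savings of this kind grow when most candidate windows are background and can be discarded by the shallow stages.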
