Discriminative Autoencoders for Vehicle Detection and Recognition in Aerial Imagery

ABSTRACT. Autoencoders, which model data by means of manifolds, can be used in object detection to model the appearance of the object classes to be detected. The distance between a vector to be classified and the manifold can then serve as a measure of the likelihood that the vector belongs to the class. However, building the manifold so that the vectors of the class lie on it does not guarantee that vectors of other classes do not lie on it as well. We seek to lift this limitation by proposing a new type of autoencoder, the discriminative autoencoder, which builds manifolds that keep patterns not belonging to the target object class away from the manifold. An experimental validation on vehicle detection and recognition in aerial imagery supports the relevance of the proposed method.
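
As an illustration of the distance-to-manifold idea described in the abstract, the following minimal sketch uses a linear stand-in (PCA, which is equivalent to a linear autoencoder) rather than the discriminative autoencoder proposed in the paper. The function names `fit_linear_manifold` and `reconstruction_distance`, the toy data, and the test points are all assumptions made for this example; they do not come from the original work.

```python
import numpy as np

def fit_linear_manifold(X, k):
    """Fit a k-dimensional affine subspace (PCA) to the rows of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, sorted by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def reconstruction_distance(x, mean, basis):
    """Euclidean distance between x and its projection onto the subspace."""
    centered = x - mean
    code = basis @ centered      # "encode": coordinates on the subspace
    recon = basis.T @ code       # "decode": back to the input space
    return float(np.linalg.norm(centered - recon))

rng = np.random.default_rng(0)

# Hypothetical toy "class": samples lying near a 1-D line in 3-D space.
t = rng.normal(size=(200, 1))
direction = np.array([[1.0, 2.0, 0.0]])
X_class = t * direction + 0.01 * rng.normal(size=(200, 3))

mean, basis = fit_linear_manifold(X_class, k=1)

# A point on the line (class-like) and a point well off the line.
d_in = reconstruction_distance(np.array([0.5, 1.0, 0.0]), mean, basis)
d_out = reconstruction_distance(np.array([0.0, 0.0, 3.0]), mean, basis)

# A point far from every training sample that still lies on the line:
# its distance to the subspace is small too, which illustrates the
# limitation the abstract raises.
d_far = reconstruction_distance(np.array([5.0, 10.0, 0.0]), mean, basis)

print(d_in, d_out, d_far)
```

In this sketch a small distance is read as "likely to belong to the class". The third point shows why fitting the manifold on class samples alone may not suffice: nothing prevents vectors unlike the training data from also lying close to it.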
