A novel target detection algorithm combining foreground and background manifold-based models

This paper addresses the detection of small objects, specifically vehicles in aerial images, against complex backgrounds such as natural scenes. A key contribution of the paper is to show that, in such situations, learning a target model and a background model separately works better than training a single discriminative model. This contrasts with standard object detection approaches, in which object-vs-background classifiers use one model and the same types of visual features for both classes. The second contribution is the manifold learning approach introduced to build these models. The proposed detection algorithm is validated on the publicly available OIRDS dataset, on which we obtain state-of-the-art results.
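
To make the idea of two separately learned models concrete, the following is a minimal sketch and not the method of the paper. It assumes a linear PCA subspace as a simple stand-in for each manifold model, synthetic feature vectors in place of real image descriptors, and a score defined as the difference of reconstruction errors. All function names and parameters (`fit_subspace`, `reconstruction_error`, `target_score`, `k`) are hypothetical.

```python
import numpy as np

def fit_subspace(X, k):
    """Fit a k-dimensional linear subspace (PCA) to the rows of X."""
    mean = X.mean(axis=0)
    # Right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_error(X, model):
    """Squared distance from each row of X to the fitted subspace."""
    mean, basis = model
    centered = X - mean
    projected = centered @ basis.T @ basis
    return np.sum((centered - projected) ** 2, axis=1)

def target_score(X, fg_model, bg_model):
    """Higher when a sample is closer to the target model than to the background model."""
    return reconstruction_error(X, bg_model) - reconstruction_error(X, fg_model)

# Synthetic stand-in data (an assumption for illustration, not OIRDS features):
# targets and background each lie near their own low-dimensional subspace.
rng = np.random.default_rng(0)
dim = 20
fg_basis = rng.normal(size=(2, dim))
bg_basis = rng.normal(size=(3, dim))

def sample(basis, n):
    coeffs = rng.normal(size=(n, basis.shape[0]))
    return coeffs @ basis + 0.05 * rng.normal(size=(n, dim))

# Each model is trained only on its own class.
fg_model = fit_subspace(sample(fg_basis, 200), k=2)
bg_model = fit_subspace(sample(bg_basis, 200), k=3)

fg_scores = target_score(sample(fg_basis, 50), fg_model, bg_model)
bg_scores = target_score(sample(bg_basis, 50), fg_model, bg_model)
print("mean score on targets:   ", fg_scores.mean())
print("mean score on background:", bg_scores.mean())
```

In this sketch each class gets its own model and its own dimensionality, which a single discriminative classifier does not allow; a nonlinear manifold model could be substituted for `fit_subspace` without changing the scoring step.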
