Paper Information - Max-Margin Object Detection

Max-Margin Object Detection

Most object detection methods operate by applying a binary classifier to sub-windows of an image, followed by a non-maximum suppression step where detections on overlapping sub-windows are removed. Since the number of possible sub-windows in even moderately sized image datasets is extremely large, the classifier is typically learned from only a subset of the windows. This avoids the computational difficulty of dealing with the entire set of sub-windows; however, as we will show in this paper, it leads to sub-optimal detector performance. In particular, the main contribution of this paper is the introduction of a new method, Max-Margin Object Detection (MMOD), for learning to detect objects in images. This method does not perform any sub-sampling, but instead optimizes over all sub-windows. MMOD can be used to improve any object detection method that is linear in the learned parameters, such as HOG or bag-of-visual-word models. Using this approach, we show substantial performance gains on three publicly available datasets. Strikingly, we show that a single rigid HOG filter can outperform a state-of-the-art deformable part model on the Face Detection Data Set and Benchmark when the HOG filter is learned via MMOD.
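The detection pipeline described in the abstract (score every sub-window, then remove overlapping detections) can be illustrated with a minimal Python sketch of greedy non-maximum suppression. This is a generic illustration, not the paper's implementation: the names `iou` and `greedy_nms`, the intersection-over-union overlap measure, the 0.5 overlap limit, and the zero score threshold are all assumptions chosen for the example, and the window scores are taken as given rather than computed by a learned classifier.

```python
from typing import List, Tuple

# A box is (left, top, right, bottom); right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_nms(
    scored: List[Tuple[float, Box]],
    threshold: float = 0.0,    # assumed decision threshold on the score
    max_overlap: float = 0.5,  # assumed overlap limit between kept boxes
) -> List[Tuple[float, Box]]:
    """Keep the highest-scoring windows that do not overlap a kept window."""
    kept: List[Tuple[float, Box]] = []
    # Visit windows from highest to lowest score.
    for score, box in sorted(scored, key=lambda sb: sb[0], reverse=True):
        if score <= threshold:
            break  # every remaining window scores at or below the threshold
        if all(iou(box, k) < max_overlap for _, k in kept):
            kept.append((score, box))
    return kept


# Hypothetical scores for four sub-windows of one image.
windows = [
    (0.9, (10, 10, 50, 50)),
    (0.8, (12, 12, 52, 52)),      # heavily overlaps the first window
    (0.7, (100, 100, 140, 140)),
    (-0.3, (200, 200, 240, 240)),  # below the threshold
]
detections = greedy_nms(windows)
```

With these hypothetical inputs, the second window is suppressed by the first and the fourth is rejected by the threshold, leaving two detections. In practice the scores would come from a classifier evaluated on each sub-window, and the abstract's point is that the set of such sub-windows is very large.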

Davis E. King
