Feature Sets and Dimensionality Reduction for Visual Object Detection

We describe a family of object detectors that provides state-of-the-art error rates on several important datasets, including INRIA Person and PASCAL VOC'06 and VOC'07. The method builds on a number of recent advances. It uses the Latent SVM learning framework and a rich visual feature set that incorporates Histogram of Oriented Gradient, Local Binary Pattern and Local Ternary Pattern descriptors. Partial Least Squares dimensionality reduction is included to speed up training of the basic classifier with no loss of accuracy, and to allow a two-stage quadratic classifier that further improves the results. A simple sparsification technique can reduce the size of the feature set by around 70% with little loss of accuracy. We evaluate our methods and compare them to other recent ones on several datasets. Our basic root detectors outperform the single-component part-based ones of Felzenszwalb et al. on 9 of 10 classes of VOC'06 (12% increase in Mean Average Precision) and 11 of 20 classes of VOC'07 (7% increase in MAP). On the INRIA Person dataset, they increase the Average Precision by 12% relative to Dalal & Triggs.
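As a rough illustration of two of the descriptors named above, the sketch below computes a Local Binary Pattern code and the two binary halves of a Local Ternary Pattern code for one pixel of a 3x3 patch. It is a minimal sketch and not the detector's actual feature pipeline: the function names, the neighbour ordering, the threshold `t=5` and the sample patch are all assumptions made for the example. Refinements such as uniform patterns and histogramming over cells are omitted.

```python
# Offsets of the 8 neighbours of a pixel, clockwise from the top-left.
# The ordering is an arbitrary choice for this sketch.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Local Binary Pattern: one bit per neighbour, set if neighbour >= centre."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def ltp_codes(img, r, c, t=5):
    """Local Ternary Pattern, split into 'upper' and 'lower' binary codes.

    A neighbour is +1 if it is at least t above the centre, -1 if it is at
    least t below, and 0 otherwise. The +1 bits form the upper code and the
    -1 bits form the lower code. The threshold t is a hypothetical value.
    """
    center = img[r][c]
    upper = 0
    lower = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        v = img[r + dr][c + dc]
        if v >= center + t:
            upper |= 1 << bit
        elif v <= center - t:
            lower |= 1 << bit
    return upper, lower


# Hypothetical 3x3 grey-level patch; codes are computed at its centre pixel.
patch = [[52, 60, 49],
         [47, 50, 58],
         [50, 44, 53]]

print(lbp_code(patch, 1, 1))      # 91
print(ltp_codes(patch, 1, 1, 5))  # (10, 32)
```

Because of the tolerance band of width `t` around the centre value, neighbours that differ only slightly from the centre (52, 49, 53, 47 here) contribute to neither ternary half, which is what makes the ternary code less sensitive to small grey-level noise than the binary one.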

[1] Thorsten Joachims et al. Making large scale SVM learning practical, 1998.

[2] Nello Cristianini et al. Advances in Kernel Methods - Support Vector Learning, 1999.

[3] B. Schölkopf et al. Advances in kernel methods: support vector learning, 1999.

[4] S. Wold et al. PLS-regression: a basic tool of chemometrics, 2001.

[5] Matti Pietikäinen et al. Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[6] Christopher K. I. Williams et al. Pascal Visual Object Classes Challenge Results, 2005.

[7] Bill Triggs et al. Histograms of oriented gradients for human detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[8] Matti Pietikäinen et al. Face Description with Local Binary Patterns: Application to Face Recognition, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9] Xiaoyang Tan et al. Enhanced Local Texture Feature Sets for Face Recognition Under Difficult Lighting Conditions, 2007, AMFG.

[10] Luc Van Gool et al. Depth and Appearance for Mobile Scene Analysis, 2007, 2007 IEEE 11th International Conference on Computer Vision.

[11] Luc Van Gool et al. A mobile vision system for robust multi-person tracking, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[12] David A. McAllester et al. A discriminatively trained, multiscale, deformable part model, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[13] Subhransu Maji et al. Classification using intersection kernel support vector machines is efficient, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[14] Shuicheng Yan et al. An HOG-LBP human detector with partial occlusion handling, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[15] Andrew Zisserman et al. Multiple kernels for object detection, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[16] David A. McAllester et al. Object Detection with Discriminatively Trained Part Based Models, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17] Larry S. Davis et al. Human detection using partial least squares analysis, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[18] Xiaoyang Tan et al. Enhanced Local Texture Feature Sets for Face Recognition Under Difficult Lighting Conditions, 2007, IEEE Transactions on Image Processing.