Generative part-based Gabor object detector

Probabilistic local part descriptor using a complex-valued Gabor feature bank.
Alternative approach using a large Gabor filter bank and part-specific optimization.
Randomized Gaussian mixture model robust to a small number of training examples.
Probabilistic part constellation model in "aligned object space".
Part-based "pose quantization" that provides pose-invariant detection.

Discriminative part-based models have become the dominant approach for visual object detection. These models learn from a large number of positive and negative examples annotated with class labels and locations (bounding boxes). In contrast, we propose a part-based generative model that learns from a small number of positive examples. This is achieved by utilizing "privileged information": sparse class-specific landmarks with semantic meaning. Our method uses bio-inspired complex-valued Gabor features to describe local parts. The Gabor features are transformed to part probabilities by an unsupervised Gaussian mixture model (GMM). GMM estimation is made robust to small amounts of data by a randomization procedure inspired by random forests. The GMM framework is also used to construct a probabilistic spatial model of part configurations. Our detector is invariant to translation, rotation and scaling. At the part level, invariance is achieved by pose quantization, which is more efficient than previously proposed feature transformations. In the spatial model, invariance is achieved by mapping parts to an "aligned object space". Using a small number of positive examples, our generative method performs comparably to the state-of-the-art discriminative method.
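
The abstract does not say how parts are mapped to the "aligned object space". As a minimal sketch, assuming the mapping is a 2D similarity transform (translation, rotation, uniform scale) fitted by least squares to landmark correspondences, the idea can be illustrated as follows. The function names `fit_similarity` and `to_aligned_space`, and the toy landmark coordinates, are hypothetical and not taken from the paper.

```python
def fit_similarity(src, dst):
    """Fit dst ~ a * src + b in the least-squares sense.

    Points are (x, y) tuples treated as complex numbers x + iy, so the
    complex factor `a` encodes rotation and uniform scale, and `b`
    encodes translation. Assumption: no reflection is allowed.
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two point correspondences")
    z = [complex(x, y) for x, y in src]
    w = [complex(x, y) for x, y in dst]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    # Closed-form solution of the complex linear least-squares problem.
    num = sum((zi - z_mean).conjugate() * (wi - w_mean) for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    if den == 0:
        raise ValueError("source points must not all coincide")
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def to_aligned_space(points, a, b):
    """Apply the fitted transform to a list of (x, y) points."""
    mapped = [a * complex(x, y) + b for x, y in points]
    return [(p.real, p.imag) for p in mapped]


# Hypothetical landmarks in the aligned (reference) frame.
reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# The same landmarks seen in an image: rotated 90 degrees, scaled by 2,
# and translated by (5, -3).
observed = [(5.0, -3.0), (5.0, -1.0), (3.0, -3.0)]

a, b = fit_similarity(observed, reference)
aligned = to_aligned_space(observed, a, b)
```

After this mapping, landmark positions from differently posed examples share one coordinate frame, so a single spatial density (a GMM, in the terms of the abstract) could be estimated over them without modelling pose explicitly.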
