A mixed generative-discriminative framework for pedestrian classification

This paper presents a novel approach to pedestrian classification that uses virtual samples synthesized by a learned generative model to enhance the classification performance of a discriminative model. The generative model captures prior knowledge about the pedestrian class as a set of probabilistic shape and texture models, each attuned to a particular pedestrian pose. Active learning links the generative and discriminative models: the generative model is sampled selectively so that training is guided towards the samples that are most informative for the discriminative model. In large-scale experiments on real-world datasets of tens of thousands of samples, we demonstrate a significant improvement in classification performance of the combined generative-discriminative approach over the discriminative-only approach, the latter exemplified by a neural network with local receptive fields and by a support vector machine using Haar wavelet features.
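The interplay between the two models can be illustrated with a minimal sketch. It is not the paper's implementation: an axis-aligned Gaussian per pose stands in for the probabilistic shape and texture models, a small logistic regression stands in for the neural network and SVM classifiers, and "most informative" is assumed to mean "closest to the current decision boundary". All function names and parameters (`fit_gaussian`, `select_informative`, `active_virtual_training`, `pool`, `k`) are hypothetical.

```python
import math
import random

def fit_gaussian(samples):
    """Fit a per-dimension mean and std to feature vectors (toy pose model)."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dim)]
    std = [max(1e-3, math.sqrt(sum((s[d] - mean[d]) ** 2 for s in samples) / n))
           for d in range(dim)]
    return mean, std

def sample_gaussian(model, rng):
    """Draw one virtual sample from a toy pose model."""
    mean, std = model
    return [rng.gauss(m, s) for m, s in zip(mean, std)]

def predict(weights, x):
    """Probability of the pedestrian class; weights[0] is the bias."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(data, epochs=200, lr=0.1):
    """Train logistic regression from scratch with plain SGD."""
    dim = len(data[0][0])
    weights = [0.0] * (dim + 1)
    for _ in range(epochs):
        for x, y in data:
            err = predict(weights, x) - y
            weights[0] -= lr * err
            for d in range(dim):
                weights[d + 1] -= lr * err * x[d]
    return weights

def select_informative(weights, candidates, k):
    """Assumed criterion: keep the k candidates nearest the decision boundary."""
    return sorted(candidates, key=lambda x: abs(predict(weights, x) - 0.5))[:k]

def active_virtual_training(pos_by_pose, negatives, rounds=3, pool=200, k=20, seed=0):
    """Alternate between sampling the generative model and retraining."""
    rng = random.Random(seed)
    pose_models = [fit_gaussian(samples) for samples in pos_by_pose]
    data = [(x, 1) for pose in pos_by_pose for x in pose]
    data += [(x, 0) for x in negatives]
    weights = train_logistic(data)
    for _ in range(rounds):
        # Draw a pool of virtual pedestrians, each from a randomly chosen pose model.
        candidates = [sample_gaussian(rng.choice(pose_models), rng) for _ in range(pool)]
        # Add only the samples the current classifier is least sure about.
        data += [(x, 1) for x in select_informative(weights, candidates, k)]
        weights = train_logistic(data)
    return weights, data

# Toy usage: two "poses" of the positive class and one negative cluster.
rng = random.Random(1)
pose_a = [[rng.gauss(2, 0.5), rng.gauss(2, 0.5)] for _ in range(10)]
pose_b = [[rng.gauss(2, 0.5), rng.gauss(-2, 0.5)] for _ in range(10)]
negatives = [[rng.gauss(-2, 0.7), rng.gauss(0, 1.5)] for _ in range(20)]
weights, data = active_virtual_training([pose_a, pose_b], negatives)
```

Each round draws a pool of candidates but adds only `k` of them, so the training set grows with the samples the current classifier finds hardest rather than with redundant, easily classified ones. Treating every virtual sample as a positive example is a simplification of this sketch.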
