Configuration Estimates Improve Pedestrian Finding

Fair discriminative pedestrian finders are now available. In fact, these pedestrian finders make most errors on pedestrians in configurations that are uncommon in the training data, for example, mounting a bicycle. This is undesirable. However, the human configuration can itself be estimated discriminatively using structure learning. We demonstrate a pedestrian finder which first finds the most likely human pose in the window using a discriminative procedure trained with structure learning on a small dataset. We then present features (local histogram of oriented gradient and local PCA of gradient) based on that configuration to an SVM classifier. We show, using the INRIA Person dataset, that estimates of configuration significantly improve the accuracy of a discriminative pedestrian finder.

[1]  Dariu Gavrila,et al.  Sensor-Based Pedestrian Protection , 2001, IEEE Intell. Syst..

[2]  Dariu Gavrila,et al.  Pedestrian Detection from a Moving Vehicle , 2000, ECCV.

[3]  Paul A. Viola,et al.  Detecting Pedestrians Using Patterns of Motion and Appearance , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[4]  Tomaso A. Poggio,et al.  Example-Based Object Detection in Images by Components , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[5]  Cordelia Schmid,et al.  Learning to Parse Pictures of People , 2002, ECCV.

[6]  Pascal Fua,et al.  Human Body Pose Recognition Using Spatio-Temporal Templates , 2005, ICCV 2005.

[7]  Ben Taskar,et al.  Learning structured prediction models: a large margin approach , 2005, ICML.

[8]  E. Rückert Detecting Pedestrians by Learning Shapelet Features , 2007 .

[9]  Yan Ke,et al.  PCA-SIFT: a more distinctive representation for local image descriptors , 2004, CVPR 2004.

[10]  Tomaso A. Poggio,et al.  A pattern classification approach to dynamical object detection , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[11]  Nathan D. Ratliff,et al.  Subgradient Methods for Maximum Margin Structured Learning , 2006 .

[12]  David A. Forsyth,et al.  Using temporal coherence to build models of animals , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[13]  Gang Hua,et al.  A statistical field model for pedestrian detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[15]  Liang Zhao,et al.  Stereo- and neural network-based pedestrian detection , 1999, Proceedings 199 IEEE/IEEJ/JSAI International Conference on Intelligent Transportation Systems (Cat. No.99TH8383).

[16]  Tomaso A. Poggio,et al.  A Trainable System for Object Detection , 2000, International Journal of Computer Vision.

[17]  Daniel P. Huttenlocher,et al.  Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.

[18]  Ben Taskar,et al.  Structured Prediction via the Extragradient Method , 2005, NIPS.

[19]  Cordelia Schmid,et al.  A Performance Evaluation of Local Descriptors , 2005, IEEE Trans. Pattern Anal. Mach. Intell..

[20]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[21]  Naum Zuselevich Shor,et al.  Minimization Methods for Non-Differentiable Functions , 1985, Springer Series in Computational Mathematics.

[22]  Richard Bowden,et al.  Detection and Tracking of Humans by Probabilistic Body Part Assembly , 2005, BMVC.

[23]  David A. Forsyth,et al.  Computational Studies of Human Motion: Part 1, Tracking and Motion Synthesis , 2005, Found. Trends Comput. Graph. Vis..

[24]  Cordelia Schmid,et al.  Human Detection Based on a Probabilistic Assembly of Robust Part Detectors , 2004, ECCV.

[25]  David A. Forsyth,et al.  Strike a pose: tracking people by finding stylized poses , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[26]  Daniel P. Huttenlocher,et al.  Efficient matching of pictorial structures , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[27]  Andrew Zisserman,et al.  Extending Pictorial Structures for Object Recognition , 2004, BMVC.

[28]  Bernt Schiele,et al.  Pedestrian detection in crowded scenes , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[29]  Deva Ramanan,et al.  Learning to parse images of articulated bodies , 2006, NIPS.