Paper Information - Object Detection with Grammar Models

Object Detection with Grammar Models

Compositional models provide an elegant formalism for representing the visual appearance of highly variable objects. While such models are appealing from a theoretical point of view, it has been difficult to demonstrate that they lead to performance advantages on challenging datasets. Here we develop a grammar model for person detection and show that it outperforms previous high-performance systems on the PASCAL benchmark. Our model represents people using a hierarchy of deformable parts, variable structure and an explicit model of occlusion for partially visible objects. To train the model, we introduce a new discriminative framework for learning structured prediction models from weakly-labeled data.

David A. McAllester | Ross B. Girshick | Pedro F. Felzenszwalb
