Inference and Learning with Hierarchical Shape Models

In this work we introduce a hierarchical representation for object detection. We represent an object in terms of parts composed of contours corresponding to object boundaries and symmetry axes; these are in turn related to edge and ridge features that are extracted from the image. We propose a coarse-to-fine algorithm for efficient detection which exploits the hierarchical nature of the model. This provides a tractable framework to combine bottom-up and top-down computation. We learn our models from training images where only the bounding box of the object is provided. We automate the decomposition of an object category into parts and contours, and discriminatively learn the cost function that drives the matching of the object to the image using Multiple Instance Learning. Using shape-based information, we obtain state-of-the-art localization results on the UIUC and ETHZ datasets.
