A multi-scale generative model for animate shapes and parts

We present a multiscale generative model for representing animate shapes and extracting meaningful parts of objects. The model assumes that animate shapes (2D simple closed curves) are formed by a linear superposition of a number of shape bases. These shape bases resemble the multiscale Gabor bases used in image pyramid representations, are well localized in both the spatial and frequency domains, and form an over-complete dictionary. The model is simpler than the popular B-spline representation because it does not require a domain partition. It therefore eliminates the interference between adjacent B-spline bases and becomes a true linear additive model. We pursue the bases by reconstructing the shape in a coarse-to-fine procedure through curve evolution. The shape bases are further organized in a tree structure, where the bases in each subtree sum to an intuitive part of the object. To build a probabilistic model for a class of objects, we propose a Markov random field model at each level of the tree representation to account for the spatial relationships between bases. The final model thus integrates a Markov tree (generative) model over scales and a Markov random field over space. We adopt an EM-type algorithm for learning the meaningful parts of a shape class, and show some results on shape synthesis.
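
The linear additive model and the coarse-to-fine pursuit can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the construction used in the paper: the curve is sampled as a complex signal z(s) = x(s) + i y(s) on a periodic parameter s in [0, 1), the bases are periodic Gaussian-windowed cosines, and the bases are selected by greedy matching pursuit (in the spirit of Mallat and Zhang) instead of curve evolution. The names `gabor_basis`, `build_dictionary`, and `coarse_to_fine_pursuit` are hypothetical.

```python
import numpy as np

def gabor_basis(s, center, scale, freq):
    """Unit-norm periodic Gabor-like atom on s in [0, 1) (assumed form)."""
    d = (s - center + 0.5) % 1.0 - 0.5          # wrapped signed distance to center
    g = np.exp(-0.5 * (d / scale) ** 2) * np.cos(2 * np.pi * freq * d)
    return g / np.linalg.norm(g)

def build_dictionary(n_samples=256, scales=(0.2, 0.1, 0.05)):
    """Over-complete dictionary: several centers and two frequencies per scale."""
    s = np.arange(n_samples) / n_samples
    atoms, levels = [], []
    for level, scale in enumerate(scales):       # level 0 is the coarsest scale
        n_centers = int(round(2 / scale))
        for c in np.arange(n_centers) / n_centers:
            for freq in (0.0, 0.5 / scale):
                atoms.append(gabor_basis(s, c, scale, freq))
                levels.append(level)
    return np.array(atoms), np.array(levels)

def coarse_to_fine_pursuit(signal, atoms, levels, per_level=8):
    """Greedy matching pursuit, restricted to one scale at a time, coarse first."""
    residual = np.array(signal, dtype=complex)
    picks = []                                   # list of (atom index, coefficient)
    for level in np.unique(levels):
        idx = np.flatnonzero(levels == level)
        for _ in range(per_level):
            proj = atoms[idx] @ residual         # inner products with the residual
            k = idx[np.argmax(np.abs(proj))]     # best-matching atom at this scale
            coef = atoms[k] @ residual
            residual = residual - coef * atoms[k]
            picks.append((k, coef))
    return picks, residual

# Toy closed curve: a circle with five lobes, sampled as z(s) = x(s) + i*y(s).
n = 256
s = np.arange(n) / n
z = (1.0 + 0.3 * np.cos(2 * np.pi * 5 * s)) * np.exp(2j * np.pi * s)

atoms, levels = build_dictionary(n)
picks, residual = coarse_to_fine_pursuit(z, atoms, levels)

# The shape is the plain sum of the selected bases plus whatever is left over.
reconstruction = sum(coef * atoms[k] for k, coef in picks)
```

Because each atom has unit norm, every pursuit step can only shrink the residual, and the reconstruction is a plain sum with no partition of the parameter domain. Grouping the selected atoms by scale and by overlap of their supports would be one way to approximate the tree of parts described above, but that step is not shown here.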
