Mapping the Ensemble of Natural Image Patches by Explicit and Implicit Manifolds

Image patches are fundamental elements for object modeling and recognition. However, the literature lacks a comprehensive study of the structure of the whole ensemble of natural image patches. In this article, we study the mathematical structure of this ensemble and map image patches into two types of subspaces, which we call "explicit manifolds" and "implicit manifolds" respectively. On explicit manifolds, one finds simple and regular image primitives, such as edges, bars, corners and junctions. On implicit manifolds, one finds complex and stochastic image patches, such as textures and clutter. The two types of manifolds use different perceptual metrics. We then present a unified framework for learning a probabilistic model on the space of patches by pursuing both types of manifolds under a common information-theoretic principle. The connection between the two types of manifolds is realized through image scaling, which changes the entropy of the image patches. The explicit manifolds live in low-entropy regimes, while the implicit manifolds live in high-entropy regimes. In experiments, we cluster natural image patches and compare the two types of manifolds using a common information-theoretic criterion. We also study the transition of the manifolds over scales and show that the complexity peaks in a middle-entropy regime, where most objects and parts reside.
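
The claim that image scaling changes patch entropy can be illustrated with a small toy experiment. The sketch below is not the article's method: it assumes a synthetic scene made of constant-intensity blocks, and it uses the Shannon entropy of a patch's intensity histogram as a crude stand-in for whatever entropy measure the article defines. The names `patch_entropy`, `downsample` and `mean_patch_entropy` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def patch_entropy(patch, bins=16):
    """Shannon entropy (bits) of the intensity histogram of a patch in [0, 1].

    Assumption: histogram entropy is only a rough proxy for patch complexity.
    """
    counts, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    return float(np.sum(p * np.log2(1.0 / p)))

def downsample(img, factor):
    """Zoom out by averaging non-overlapping factor x factor pixel blocks."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def mean_patch_entropy(img, size=8):
    """Average histogram entropy over non-overlapping size x size patches."""
    h, w = img.shape
    values = [patch_entropy(img[i:i + size, j:j + size])
              for i in range(0, h - size + 1, size)
              for j in range(0, w - size + 1, size)]
    return float(np.mean(values))

rng = np.random.default_rng(0)
# Synthetic scene (an assumption): a 16 x 16 grid of flat blocks, each 16 px wide.
block_values = rng.random((16, 16))
scene = np.kron(block_values, np.ones((16, 16)))  # 256 x 256 image

# Viewed up close, every 8 x 8 patch lies inside a single flat block.
near = mean_patch_entropy(scene)
# Zoomed out by 16x, each pixel is a whole block, so patches look like noise.
far = mean_patch_entropy(downsample(scene, 16))

# A clean two-level step edge, the kind of primitive the abstract calls explicit.
edge = np.hstack([np.zeros((8, 4)), np.ones((8, 4))])
edge_bits = patch_entropy(edge)
```

In this toy setting, `near` is much smaller than `far`: the same scene yields flat, low-entropy patches up close and noise-like, high-entropy patches when zoomed out, and the step edge in `edge` sits at the low end. This is consistent in spirit with the low-entropy and high-entropy regimes described above, but it says nothing about the article's actual measurements.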
