Texture modeling and synthesis using joint statistics of complex wavelet coefficients

We present a statistical characterization of texture images in the context of an overcomplete complex wavelet transform. The characterization is based on empirical observations of statistical regularities in such images, and parameterized by (1) the local autocorrelation of the coefficients in each subband; (2) both the local autocorrelation and cross-correlation of coefficient magnitudes at other orientations and spatial scales; and (3) the first few moments of the image pixel histogram. We develop an efficient algorithm for synthesizing random images subject to these constraints using alternated projections, and demonstrate its effectiveness on a wide range of synthetic and natural textures. In particular, we show that many important structural elements in textures (e.g., edges, repeated patterns or alternated patches of simpler texture) can be captured through joint second-order statistics of the coefficient magnitudes. We also show the flexibility of the representation by applying it to a variety of tasks that can be viewed as constrained image synthesis problems, such as spatial and spectral extrapolation.

1 Instituto de Optica, CSIC, Serrano 121, 28006 Madrid, SPAIN. iodpm79@pinar2.csic.es

2 Center for Neural Science, and Courant Inst. of Mathematical Sciences, New York University, New York, NY 10003. eero.simoncelli@nyu.edu

JP is supported by a fellowship from the Consejo Superior de Investigaciones Cientificas (CSIC), and the Comision Interministerial de Ciencia y Tecnologia (CICYT, Spain), under grant TIC97-325. EPS is supported by an Alfred P. Sloan Research Fellowship, NSF CAREER grant MIP-9796040, and the Sloan Center for Theoretical Neurobiology at NYU.

Vision is arguably our most important sensory system, judging from both the ubiquity of visual forms of communication, and the large proportion of the human brain devoted to visual processing.
Nevertheless, it has proven difficult to establish a good mathematical definition (in the form of a statistical model) for visual images. The set of visual images is enormous, and yet only a small fraction of these are likely to be encountered in a natural setting [43, 25, 19, 54]. Thus, a statistical prior model, even one that only partially captures these variations in likelihood, can substantially benefit image processing and artificial vision systems. In addition, many authors have proposed that biological visual systems have evolved to take advantage of the statistical properties of the signals to which they are exposed [e.g., 3, 4, 2], thus suggesting a direct link between image statistics and visual processing.

In order to characterize visual images statistically, one must make some sort of restriction on the probability model. The most common assumptions are locality (the characterization is specified on local spatial neighborhoods), stationarity (the statistics depend only on relative spatial position within the image), and a parametric form for the density (e.g., Gaussian). The subclass of images that we commonly call "visual texture" seems most consistent with local stationary density models. Traditional definitions of texture can be classified into "structural" and "statistical" [34], the first consisting of a set of repeated deterministic features, and the second corresponding to a sample drawn from a probability density. Such a distinction has turned out to be somewhat artificial: for example, Zhu et al. [69] have demonstrated that it is possible to capture and reproduce structural elements in texture using purely statistical models. Furthermore, many real-world textures seem to incorporate both aspects, in that they can be described as a set of repeating structural elements subject to some randomness in their location, size, color, orientation, etc. This observation leads us to seek a single method of representing textures.
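To make the locality and stationarity assumptions concrete, the following is a minimal Python sketch, not the authors' implementation. It estimates two of the kinds of statistics mentioned in the abstract (a local autocorrelation and the first few histogram moments), but computes them directly on pixel values rather than on complex wavelet subbands. The function names, the circular boundary handling, and the toy stripe image are all assumptions made for illustration.

```python
import math


def pixel_moments(img):
    """Return mean, variance, skewness and kurtosis of the pixel values.

    `img` is a list of equal-length rows of numbers (hypothetical format).
    """
    values = [v for row in img for v in row]
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        # Constant image: higher-order standardized moments are undefined,
        # so this sketch simply reports zeros.
        return mean, 0.0, 0.0, 0.0
    std = math.sqrt(var)
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n
    return mean, var, skew, kurt


def local_autocorrelation(img, radius):
    """Estimate the autocorrelation for displacements |dy|, |dx| <= radius.

    Locality: only a small neighborhood of displacements is measured.
    Stationarity: the statistic is assumed to depend only on the relative
    displacement (dy, dx), so it is averaged over every position (y, x).
    Boundaries are treated as circular (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    mean = sum(sum(row) for row in img) / (h * w)
    acorr = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            total = 0.0
            for y in range(h):
                for x in range(w):
                    a = img[y][x] - mean
                    b = img[(y + dy) % h][(x + dx) % w] - mean
                    total += a * b
            acorr[(dy, dx)] = total / (h * w)
    return acorr


# Toy "structural" texture: vertical stripes with a period of two pixels.
stripes = [[x % 2 for x in range(4)] for _ in range(4)]
acorr = local_autocorrelation(stripes, radius=1)
moments = pixel_moments(stripes)
# acorr[(0, 0)] is 0.25 (the variance); acorr[(0, 1)] is -0.25, because
# horizontally adjacent pixels always take opposite values.
```

Even on this tiny example, the purely statistical measurement reflects the repeated structure: the sign of the horizontal correlation alternates with the stripe period, while the vertical correlation stays equal to the variance.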
In this paper, we develop a fully statistical description, and demonstrate that it is also able to capture and reproduce a wide variety of structural elements.

Julesz pioneered the statistical characterization of textures by hypothesizing that the Nth-order joint empirical densities (for some unspecified N) of neighborhoods of image pixels could be used to partition textures into classes that are indistinguishable to a human observer [40]. This work thus established the use of both the locality and stationarity assumptions, the goal of determining a minimal set of statistical constraints, and the validation of texture models using human observers. Since then, researchers have explored a wide variety of approaches for texture characterization and synthesis.

One of the most basic distinctions between the various approaches is the choice of representation. Starting with Julesz, many authors have worked directly on the statistical attributes of local spatial neighborhoods of pixels, typically in the form of a Markov random field [e.g., 35, 42, 17]. But most others attempt to simplify the description of the density by first processing with a set of linear filters, such as Gabor filters or a multiscale basis. The use of localized multi-scale multi-orientation sets of bandpass filters is inspired by what is known of biological visual processing, and justified by recent studies of the higher-order statistical properties of such representations (we discuss this further in section 1). These filters may be held fixed, or chosen adaptively depending on the image statistics [e.g., 10, 26, 22, 59, 51]. Assuming a preconditioning subband decomposition, most texture characterizations

[1]  J. Cadzow,et al.  Image texture synthesis-by-analysis using moving-average models , 1993 .

[2]  D J Field,et al.  Relations between the statistics of natural images and the response properties of cortical cells. , 1987, Journal of the Optical Society of America. A, Optics and image science.

[3]  Eero P. Simoncelli Statistical models for images: compression, restoration and synthesis , 1997, Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers (Cat. No.97CB36136).

[4]  Eero P. Simoncelli,et al.  Texture characterization via joint statistics of wavelet coefficient magnitudes , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[5]  A B Watson,et al.  Efficiency of a model human image code. , 1987, Journal of the Optical Society of America. A, Optics and image science.

[6]  Wilson S. Geisler,et al.  Multichannel Texture Analysis Using Localized Spatial Filters , 1990, IEEE Trans. Pattern Anal. Mach. Intell..

[7]  P. Perona,et al.  Detecting and localizing edges composed of steps , 1990 .

[8]  Oscar Nestares,et al.  Texture synthesis‐by‐analysis method based on a multiscale early‐vision model , 1996 .

[9]  Béla Julesz,et al.  Visual Pattern Discrimination , 1962, IRE Trans. Inf. Theory.

[10]  André Gagalowicz,et al.  A New Method for Texture Fields Synthesis: Some Applications to the Study of Human Vision , 1981, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Song-Chun Zhu,et al.  Prior Learning and Gibbs Reaction-Diffusion , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[12]  Edward H. Adelson,et al.  The Design and Use of Steerable Filters , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[13]  Paul A. Viola,et al.  A Non-Parametric Multi-Scale Statistical Model for Natural Images , 1997, NIPS.

[14]  Robert M. Haralick Statistical and Structural Approaches to Texture , 1979 .

[15]  M. Hassner,et al.  The use of Markov Random Fields as models of texture , 1980 .

[16]  Ibrahim M. Elfadel,et al.  Gibbs Random Fields, Cooccurrences, and Texture Modeling , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[17]  Eero P. Simoncelli,et al.  Progressive wavelet image coding based on a conditional probability model , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[18]  Stéphane Mallat,et al.  A Theory for Multiresolution Signal Decomposition: The Wavelet Representation , 1989, IEEE Trans. Pattern Anal. Mach. Intell..

[19]  Rafael Fonolla Navarro,et al.  Robust method for texture synthesis-by-analysis based on a multiscale Gabor scheme , 1996, Electronic Imaging.

[20]  Song-Chun Zhu,et al.  Minimax Entropy Principle and Its Application to Texture Modeling , 1997, Neural Computation.

[21]  Eero P. Simoncelli,et al.  Shiftable Multi-scale Transforms , 1992 .

[22]  Hans Knutsson,et al.  Texture Analysis Using Two-Dimensional Quadrature Filters , 1983 .

[23]  William E. Higgins,et al.  Texture Segmentation using 2-D Gabor Elementary Functions , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[24]  Joseph M. Francos,et al.  A unified texture model based on a 2-D Wold-like decomposition , 1993, IEEE Trans. Signal Process..

[25]  D. Youla,et al.  Image Restoration by the Method of Convex Projections: Part 1-Theory , 1982, IEEE Transactions on Medical Imaging.

[26]  Michael Unser,et al.  Texture classification and segmentation using wavelet frames , 1995, IEEE Trans. Image Process..

[27]  D. Field,et al.  Natural image statistics and efficient coding. , 1996, Network.

[28]  Kris Popat,et al.  Cluster-based probability model and its application to image and texture processing , 1997, IEEE Trans. Image Process..

[29]  Olivier D. Faugeras,et al.  Decorrelation Methods of Texture Feature Extraction , 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30]  Alfredo Restrepo,et al.  Localized measurement of emergent image frequencies by Gabor wavelets , 1992, IEEE Trans. Inf. Theory.

[31]  Dante C. Youla,et al.  Generalized Image Restoration by the Method of Alternating Orthogonal Projections , 1978 .

[32]  William Bialek,et al.  Statistics of Natural Images: Scaling in the Woods , 1993, NIPS.

[33]  D Kersten,et al.  Predictability and redundancy of natural images. , 1987, Journal of the Optical Society of America. A, Optics and image science.

[34]  Song-Chun Zhu,et al.  FRAME: filters, random fields, and minimax entropy towards a unified theory for texture modeling , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[35]  Songde Ma,et al.  Sequential synthesis of natural textures , 1985, Comput. Vis. Graph. Image Process..

[36]  Calvin C. Gotlieb,et al.  Texture descriptors based on co-occurrence matrices , 1990, Comput. Vis. Graph. Image Process..

[37]  Pietro Perona,et al.  Overcomplete steerable pyramid filters and rotation invariance , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Theodosios Pavlidis,et al.  Segmentation by Texture Using Correlation , 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Charles A. Bouman,et al.  A multiscale random field model for Bayesian image segmentation , 1994, IEEE Trans. Image Process..

[40]  J.G. Daugman,et al.  Entropy reduction and decorrelation in visual coding by oriented neural receptive fields , 1989, IEEE Transactions on Biomedical Engineering.

[41]  Tomaso Poggio,et al.  Computing texture boundaries from images , 1988, Nature.

[42]  F. Attneave Some informational aspects of visual perception. , 1954, Psychological review.

[43]  Harry Wechsler,et al.  Segmentation of Textured Images and Gestalt Organization Using Spatial/Spatial-Frequency Representations , 1990, IEEE Trans. Pattern Anal. Mach. Intell..

[44]  Olivier D. Faugeras,et al.  Visual Discrimination of Stochastic Texture Fields , 1978, IEEE Transactions on Systems, Man, and Cybernetics.

[45]  E. Jaynes Information Theory and Statistical Mechanics , 1957 .

[46]  Eero P. Simoncelli,et al.  Image compression via joint statistical characterization in the wavelet domain , 1999, IEEE Trans. Image Process..

[47]  Bedrich J. Hosticka,et al.  Unsupervised texture segmentation of images using tuned matched Gabor filters , 1995, IEEE Trans. Image Process..

[48]  Takashi Totsuka,et al.  Combining frequency and spatial domain information for fast interactive image noise removal , 1996, SIGGRAPH.

[49]  N. Graham Visual Pattern Analyzers , 1989 .

[50]  Terrence J. Sejnowski,et al.  The “independent components” of natural scenes are edge filters , 1997, Vision Research.

[51]  Tor Lønnestad,et al.  An evaluation of stochastic models for analysis and synthesis of gray-scale texture , 1994, Pattern Recognit. Lett..

[52]  Anil K. Jain,et al.  Markov Random Field Texture Models , 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[53]  J. H. van Hateren,et al.  Modelling the Power Spectra of Natural Images: Statistics and Information , 1996, Vision Research.

[54]  Haluk Derin,et al.  Modeling and Segmentation of Noisy and Textured Images Using Gibbs Random Fields , 1987, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[55]  E. Adelson,et al.  Early vision and texture perception , 1988, Nature.

[56]  Rosalind W. Picard,et al.  Conjoint probabilistic subband modeling , 1997 .

[57]  Rama Chellappa,et al.  Texture classification using features derived from random field models , 1982, Pattern Recognit. Lett..

[58]  D. Cano,et al.  Texture synthesis using hierarchical linear transforms , 1988 .