Textons, contours and regions: cue integration in image segmentation

The paper makes two contributions: it provides (1) an operational definition of textons, the putative elementary units of texture perception, and (2) an algorithm for partitioning the image into disjoint regions of coherent brightness and texture, where boundaries of regions are defined by peaks in contour orientation energy and differences in texton densities across the contour. B. Julesz (1981) introduced the term texton, analogous to a phoneme in speech recognition, but did not provide an operational definition for gray-level images. We re-invent textons as frequently co-occurring combinations of oriented linear filter outputs. These can be learned using a K-means approach. By mapping each pixel to its nearest texton, the image can be analyzed into texton channels, each of which is a point set where discrete techniques such as Voronoi diagrams become applicable. Local histograms of texton frequencies can be used with a /spl chi//sup 2/ test for significant differences to find texture boundaries. Natural images contain both textured and untextured regions, so we combine this cue with that of the presence of peaks of contour energy derived from outputs of odd- and even-symmetric oriented Gaussian derivative filters. Each of these cues has a domain of applicability, so to facilitate cue combination we introduce a gating operator based on a statistical test for isotropy of Delaunay neighbors. Having obtained a local measure of how likely two nearby pixels are to belong to the same region, we use the spectral graph theoretic framework of normalized cuts to find partitions of the image into regions of coherent texture and brightness. Experimental results on a wide range of images are shown.

[1]  P. O. Bishop,et al.  Spatial vision. , 1971, Annual review of psychology.

[2]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[3]  B. Julesz Textons, the elements of texture perception, and their interactions , 1981, Nature.

[4]  Thomas O. Binford,et al.  Inferring Surfaces from Images , 1981, Artif. Intell..

[5]  Hans Knutsson,et al.  Texture Analysis Using Two-Dimensional Quadrature Filters , 1983 .

[6]  P Perona,et al.  Preattentive texture discrimination with early vision mechanisms , 1990 .

[7]  Jitendra Malik,et al.  Detecting and localizing edges composed of steps, peaks and roofs , 1990, [1990] Proceedings Third International Conference on Computer Vision.

[8]  Allen Gersho,et al.  Vector quantization and signal compression , 1991, The Kluwer international series in engineering and computer science.

[9]  Jitendra Malik,et al.  A Computational Framework for Determining Stereo Correspondence from a Set of Linear Spatial Filters , 1991, ECCV.

[10]  Jitendra Malik,et al.  Computational framework for determining stereo correspondence from a set of linear spatial filters , 1992, Image Vis. Comput..

[11]  N. Fisher,et al.  Statistical Analysis of Circular Data , 1993 .

[12]  Joachim M. Buhmann,et al.  Non-parametric similarity measures for unsupervised texture segmentation and image retrieval , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[14]  Jitendra Malik,et al.  Contour Continuity in Region Based Image Segmentation , 1998, ECCV.

[15]  Jitendra Malik,et al.  Color- and texture-based image segmentation using EM and its application to content-based image retrieval , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[16]  Jitendra Malik,et al.  Recognizing surfaces using three-dimensional textons , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.