Affine-invariant local descriptors and neighborhood statistics for texture recognition

We present a framework for texture recognition based on local affine-invariant descriptors and their spatial layout. At modelling time, a generative model of local descriptors is learned from sample images using the EM algorithm. The EM framework allows the incorporation of unsegmented multitexture images into the training set. The second modelling step consists of gathering co-occurrence statistics of neighboring descriptors. At recognition time, initial probabilities computed from the generative model are refined using a relaxation step that incorporates co-occurrence statistics. Performance is evaluated on images of an indoor scene and pictures of wild animals.
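The relaxation step described above can be illustrated with a minimal sketch. It assumes a classic Rosenfeld-style update rule, which may differ from the exact formulation used in this work. The function name `relax`, the uniform neighbor weights, and the hand-set compatibility table `compat` are all hypothetical. In the framework, the compatibilities would be derived from the gathered co-occurrence statistics, and the initial probabilities would come from the generative model learned with EM.

```python
def relax(probs, neighbors, compat, iterations=10):
    """Refine per-region class probabilities with a relaxation update.

    probs:     list of per-region probability lists (each sums to 1)
    neighbors: neighbors[i] lists the indices of regions adjacent to region i
    compat:    compat[c][d] in [-1, 1], compatibility of class c with a
               neighbor of class d (assumed here; hand-set in the demo below)
    """
    n_classes = len(compat)
    for _ in range(iterations):
        updated = []
        for i, p_i in enumerate(probs):
            nbrs = neighbors[i]
            if not nbrs:
                # A region with no neighbors receives no contextual support.
                updated.append(list(p_i))
                continue
            w = 1.0 / len(nbrs)  # uniform neighbor weights (an assumption)
            # Support for class c: weighted compatibility with neighbor beliefs.
            support = [
                sum(w * compat[c][d] * probs[j][d]
                    for j in nbrs for d in range(n_classes))
                for c in range(n_classes)
            ]
            # Multiplicative update followed by renormalization.
            raw = [p_i[c] * (1.0 + support[c]) for c in range(n_classes)]
            total = sum(raw)
            updated.append([r / total for r in raw] if total > 0 else list(p_i))
        probs = updated  # synchronous update: all regions change together
    return probs


# Toy example: three regions in a chain, two texture classes.
initial = [[0.9, 0.1], [0.45, 0.55], [0.8, 0.2]]
neighbors = [[1], [0, 2], [1]]
compat = [[1.0, -1.0], [-1.0, 1.0]]  # same-class neighbors reinforce each other
refined = relax(initial, neighbors, compat)
```

In this toy setup the middle region starts out slightly favoring the second class, and the support from its two confident neighbors pulls it toward the first. The same mechanism lets ambiguous local descriptors be relabeled according to their spatial context.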
