Attentional selection in object recognition

A key problem in object recognition is selection, namely, the problem of identifying regions in an image within which to start the recognition process, ideally by isolating regions that are likely to come from a single object. Such a selection mechanism has been found to be crucial in reducing the combinatorial search involved in the matching stage of object recognition. Even though selection helps recognition, it has largely remained unsolved because of the difficulty of isolating regions belonging to objects under complex imaging conditions involving occlusions and changes in illumination and object appearance. This thesis presents a novel approach to the selection problem by proposing a computational model of visual attentional selection as a paradigm for selection in recognition. In particular, it proposes two modes of attentional selection, namely, the attracted-attention and pay-attention modes, as being appropriate for data-driven and model-driven selection in recognition, respectively. An implementation of this model has led to new ways of extracting color, texture, and line group information in images, and of subsequently using this information to isolate areas of the scene likely to contain the model object. Among the specific results in this thesis are a method of specifying color by perceptual color categories for fast color region segmentation and color-based localization of objects, and a result showing that texture patterns on model objects can be recognized under changes in orientation and under occlusion without detailed segmentation. The thesis also presents an evaluation of the proposed model by integrating it with a 3D-from-2D object recognition system and recording the improvement in performance. These results indicate that attentional selection can significantly reduce the computational bottleneck in object recognition, both through a reduction in the number of features and through a reduction in the number of matches during recognition, using the information derived during selection.
Finally, these studies have revealed a surprising use of selection, namely, in the partial solution of the pose of a 3D object.

Thesis Supervisor: W. Eric L. Grimson
Title: Associate Professor of Electrical Engineering and Computer Science

A little learning is a dangerous thing;
Drink deep, or taste not the Pierian spring:
There shallow draughts intoxicate the brain,
And drinking largely sobers us again.
Fired at first sight with what the Muse imparts,
In fearless youth we tempt the heights of Arts,
While from the bounded level of our mind,
Short views we take, nor see the lengths behind;
But more advanced, behold with strange surprise
New distant scenes of endless science rise!
So pleased at first the towering Alps we try,
Mount o'er the vales, and seem to tread the sky,
Th' eternal snows appear already past,
And the first clouds and mountains seem the last;
But, those attained, we tremble to survey
The growing labors of the lengthened way,
Th' increasing prospect tires our wandering eyes,
Hills peep o'er hills, and Alps on Alps arise!
