Paper information - Foveated object recognition by corner search

We describe a grayscale object recognition system based on foveated corner finding, the computation of sequential fixation points, and elements of Lowe's SIFT transform. The system achieves rotation-, translation-, and limited scale-invariant object recognition, and it makes recognition decisions from data extracted at sequential fixation points. The work is divided into two logical steps. The first step develops principles of foveated visual search and automated fixation selection to accomplish corner search. The result is a new algorithm for finding corners that also serves as a corner-based algorithm for aiming computed foveated visual fixations. In the algorithm, long saccades move the fovea to previously unexplored areas of the image, while short saccades improve the accuracy of putative corner locations. The system is tested on two natural scenes. As a comparison study, we compare the fixations generated by the algorithm with those of human subjects viewing the same images while their eye movements are recorded by an eye tracker. Fixation patterns are compared using an information-theoretic measure. The results show that the algorithm is a good locator of corners but does not correlate particularly well with human visual fixations. The second step uses the located corners that meet certain goodness criteria as keypoints in a modified version of the SIFT algorithm, implemented at two scales. This implementation creates a database of SIFT features of known objects. To recognize an unknown object, a corner is located and a feature vector is created; the feature vector is then compared with those in the database of known objects. The process is repeated for each corner in the unknown object until enough information has been accumulated to reach a decision. The system was tested on 78 grayscale objects, consisting of hand tools and airplanes, and was shown to perform well.
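The recognition loop described in the abstract (match one corner's feature vector against the database, then continue until enough evidence has accumulated) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `nearest_label` and `recognize`, the nearest-neighbor vote, the stopping parameters `min_votes`, `margin`, and `max_dist`, and the toy three-dimensional vectors that stand in for real SIFT-style descriptors. The abstract does not state the actual decision rule.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_label(feature, database):
    """Return (label, distance) of the stored vector closest to `feature`.

    `database` is a list of (label, vector) pairs (hypothetical layout).
    """
    label, vector = min(database, key=lambda item: euclidean(feature, item[1]))
    return label, euclidean(feature, vector)


def recognize(corner_features, database, min_votes=3, margin=2, max_dist=0.5):
    """Process corners one at a time and stop once one label clearly leads.

    Assumed stopping rule: the leading label has at least `min_votes` votes
    and is ahead of the runner-up by at least `margin` votes. Matches farther
    than `max_dist` are ignored. Returns (label or None, corners examined).
    """
    votes = Counter()
    used = 0
    for used, feature in enumerate(corner_features, start=1):
        label, dist = nearest_label(feature, database)
        if dist <= max_dist:
            votes[label] += 1
        ranked = votes.most_common(2)
        if ranked:
            lead = ranked[0][1]
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            if lead >= min_votes and lead - runner_up >= margin:
                return ranked[0][0], used
    return None, used  # not enough evidence to decide


# Toy database: two known objects, two stored feature vectors each.
DATABASE = [
    ("hammer", (0.9, 0.1, 0.0)),
    ("hammer", (0.8, 0.2, 0.1)),
    ("airplane", (0.1, 0.9, 0.2)),
    ("airplane", (0.0, 0.8, 0.3)),
]

# Feature vectors from successive corners of an unknown object.
UNKNOWN = [
    (0.85, 0.15, 0.05),
    (0.10, 0.85, 0.25),
    (0.90, 0.12, 0.00),
    (0.82, 0.18, 0.08),
    (0.88, 0.10, 0.02),
]

if __name__ == "__main__":
    print(recognize(UNKNOWN, DATABASE))  # ('hammer', 4)
```

In this toy run the decision is reached after four of the five corners, which mirrors the idea of stopping as soon as the accumulated evidence is sufficient rather than always examining every corner.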
