Paper Information - Robust Object Recognition with Cortex-Like Mechanisms

Robust Object Recognition with Cortex-Like Mechanisms

We introduce a new general framework for the recognition of complex visual scenes, which is motivated by biology: we describe a hierarchical system that closely follows the organization of visual cortex and builds an increasingly complex and invariant feature representation by alternating between a template matching and a maximum pooling operation. We demonstrate the strength of the approach on a range of recognition tasks, from invariant single object recognition in clutter to multiclass categorization problems and complex scene understanding tasks that rely on the recognition of both shape-based and texture-based objects. Given the biological constraints that the system had to satisfy, the approach performs surprisingly well: it has the capability of learning from only a few training examples and competes with state-of-the-art systems. We also discuss the existence of a universal, redundant dictionary of features that could handle the recognition of most object categories. In addition to its relevance for computer vision, the success of this approach suggests a plausibility proof for a class of feedforward models of object recognition in cortex.
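
The alternation between template matching and maximum pooling described in the abstract can be illustrated with a minimal sketch of a single stage. Everything specific here is an assumption made for illustration rather than the authors' implementation: the function names (`gabor_kernel`, `template_match`, `max_pool`), the Gabor parameters, the use of a normalized dot product as the matching score, and pooling over position only.

```python
import numpy as np

def gabor_kernel(size=7, wavelength=4.0, sigma=2.0, theta=0.0, gamma=0.3):
    """Build a zero-mean, unit-norm Gabor template (parameter values are illustrative)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    g *= np.cos(2 * np.pi * xr / wavelength)
    g -= g.mean()
    return g / np.linalg.norm(g)

def template_match(image, kernel):
    """Template-matching step: normalized dot product of the kernel with every patch."""
    k = kernel.shape[0]
    windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    response = np.einsum("ijkl,kl->ij", windows, kernel)
    norms = np.sqrt((windows ** 2).sum(axis=(2, 3))) + 1e-9
    return np.abs(response) / norms

def max_pool(response, pool=4):
    """Pooling step: maximum over non-overlapping pool x pool neighborhoods."""
    h, w = response.shape
    h, w = h - h % pool, w - w % pool  # drop rows/columns that do not fill a block
    blocks = response[:h, :w].reshape(h // pool, pool, w // pool, pool)
    return blocks.max(axis=(1, 3))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((34, 34))
    s = template_match(image, gabor_kernel())  # selectivity: 28 x 28 response map
    c = max_pool(s, pool=4)                    # invariance: 7 x 7 pooled map
    print(s.shape, c.shape)
```

Stacking further stages on the pooled map, with templates sampled from training images instead of fixed Gabor filters, would give the kind of increasingly complex and invariant representation the abstract refers to. This sketch stops at one stage.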

[1] D. Hubel, et al. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex, 1962, The Journal of Physiology.

[2] P. Schiller, et al. Quantitative studies of single-cell properties in monkey striate cortex. III. Spatial frequency, 1976, Journal of Neurophysiology.

[3] P. Schiller, et al. Quantitative studies of single-cell properties in monkey striate cortex. II. Orientation specificity and ocular dominance, 1976, Journal of Neurophysiology.

[4] D. Marr, et al. A computational theory of human stereo vision, 1979, Proceedings of the Royal Society of London. Series B, Biological Sciences.

[5] T. Poggio, et al. A computational theory of human stereo vision, 1979, Proceedings of the Royal Society of London. Series B, Biological Sciences.

[6] D. Marr, et al. Bandpass channels, zero-crossings, and early visual information processing, 1979, Journal of the Optical Society of America.

[7] A. Treisman, et al. A feature-integration theory of attention, 1980, Cognitive Psychology.

[8] R. L. De Valois, et al. The orientation and direction selectivity of cells in macaque visual cortex, 1982, Vision Research.

[9] D. G. Albrecht, et al. Spatial frequency selectivity of cells in macaque visual cortex, 1982, Vision Research.

[10] J. Daugman. Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters, 1985, Journal of the Optical Society of America. A, Optics and Image Science.

[11] Tomaso A. Poggio, et al. On Edge Detection, 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12] J. P. Jones, et al. An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex, 1987, Journal of Neurophysiology.

[13] Wilson S. Geisler, et al. Multichannel Texture Analysis Using Localized Spatial Filters, 1990, IEEE Trans. Pattern Anal. Mach. Intell.

[14] David I. Perrett, et al. Neurophysiology of shape processing, 1993, Image Vis. Comput.

[15] N. Logothetis, et al. Shape representation in the inferior temporal cortex of monkeys, 1995, Current Biology.

[16] David J. Field, et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images, 1996, Nature.

[17] Tony Lindeberg, et al. Scale-space theory: A framework for handling image structures at multiple scales, 1996.

[18] Denis Fize, et al. Speed of processing in the human visual system, 1996, Nature.

[19] Bartlett W. Mel. SEEMORE: Combining Color, Shape, and Texture Histogramming in a Neurally Inspired Approach to Visual Object Recognition, 1997, Neural Computation.

[20] E. Rolls, et al. Invariant face and object recognition in the visual system, 1997, Progress in Neurobiology.

[21] J. Wolfe, et al. Preattentive Object Files: Shapeless Bundles of Basic Features, 1997, Vision Research.

[22] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[23] Hartmut Neven, et al. The Bochum/USC Face Recognition System And How it Fared in the FERET Phase III Test, 1998.

[24] T. Poggio, et al. Hierarchical models of object recognition in cortex, 1999, Nature Neuroscience.

[25] Jitendra Malik, et al. Blobworld: A System for Region-Based Image Indexing and Retrieval, 1999, VISUAL.

[26] David G. Lowe, et al. Object recognition from local scale-invariant features, 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[27] Pietro Perona, et al. Unsupervised Learning of Models for Recognition, 2000, ECCV.

[28] Edmund T. Rolls, et al. A Model of Invariant Object Recognition in the Visual System: Learning Rules, Activation Functions, Lateral Inhibition, and Information-Based Performance Measures, 2000, Neural Computation.

[29] Tomaso A. Poggio, et al. Example-Based Object Detection in Images by Components, 2001, IEEE Trans. Pattern Anal. Mach. Intell.

[30] Thomas Serre, et al. Categorization by Learning and Combining Object Parts, 2001, NIPS.

[31] Peter Meer, et al. Synergism in low level vision, 2002, Object recognition supported by user interaction for service robots.

[32] Thomas Serre, et al. On the Role of Object-Specific Features for Real World Object Recognition in Biological Vision, 2002, Biologically Motivated Computer Vision.

[33] Terence Sim, et al. The CMU Pose, Illumination, and Expression (PIE) database, 2002, Proceedings of Fifth IEEE International Conference on Automatic Face Gesture Recognition.

[34] Simon J. Thorpe, et al. Ultra-Rapid Scene Categorization with a Wave of Spikes, 2002, Biologically Motivated Computer Vision.

[35] Michel Vidal-Naquet, et al. Visual features of intermediate complexity and their use in classification, 2002, Nature Neuroscience.

[36] T. Gawne, et al. Responses of primate visual cortical V4 neurons to simultaneously presented stimuli, 2002, Journal of Neurophysiology.

[37] B. Schiele, et al. Interleaved Object Categorization and Segmentation, 2003, BMVC.

[38] Tai Sing Lee, et al. Hierarchical Bayesian inference in the visual cortex, 2003, Journal of the Optical Society of America. A, Optics, Image Science, and Vision.

[39] Y. Amit, et al. An integrated network for invariant visual detection and recognition, 2003, Vision Research.

[40] Pietro Perona, et al. Object class recognition by unsupervised scale-invariant learning, 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings.

[41] Heiko Wersing, et al. Learning Optimized Features for Hierarchical Models of Invariant Object Recognition, 2003, Neural Computation.

[42] Tomaso Poggio, et al. Intracellular measurements of spatial integration and the MAX operation in complex cells of the cat primary visual cortex, 2004, Journal of Neurophysiology.

[43] Thomas Serre, et al. Realistic Modeling of Simple and Complex Cell Tuning in the HMAX Model, and Implications for Invariant Object Recognition in Cortex, 2004.

[44] B. Schiele, et al. Combined Object Categorization and Segmentation With an Implicit Shape Model, 2004.

[45] A. Torralba, et al. Sharing features: efficient boosting procedures for multiclass object detection, 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.

[46] Kunihiko Fukushima, et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, 1980, Biological Cybernetics.

[47] T. Sanger, et al. Stereo disparity computation using Gabor filters, 1988, Biological Cybernetics.

[48] Brian Leung, et al. Component-based Car Detection in Street Scene Images, 2004.

[49] Tomaso Poggio, et al. Generalization in vision and motor control, 2004, Nature.

[50] Pietro Perona, et al. Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories, 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[51] Jitendra Malik, et al. When is scene identification just texture recognition?, 2004, Vision Research.

[52] Y. LeCun, et al. Learning methods for generic object recognition with invariance to pose and lighting, 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.

[53] Thomas Serre, et al. Object recognition with features inspired by visual cortex, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[54] Cordelia Schmid, et al. A maximum entropy framework for part-based texture and object recognition, 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[55] Lior Wolf, et al. A Unified System For Object Detection, Texture Recognition, and Context Analysis Based on the Standard Model Feature Set, 2005, BMVC.

[56] Jitendra Malik, et al. Shape matching and object recognition using low distortion correspondences, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[57] A. Treisman, et al. Perception of objects in natural scenes: is it really attention free?, 2005, Journal of Experimental Psychology: Human Perception and Performance.

[58] Jean Ponce, et al. The Local Projective Shape of Smooth Surfaces and Their Outlines, 2005, International Journal of Computer Vision.

[59] Cordelia Schmid, et al. A performance evaluation of local descriptors, 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[60] Tomaso Poggio, et al. Fast Readout of Object Identity from Macaque Inferior Temporal Cortex, 2005, Science.

[61] Bill Triggs, et al. Histograms of oriented gradients for human detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[62] Yann LeCun, et al. Learning a similarity metric discriminatively, with application to face verification, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[63] Alex Holub, et al. Exploiting Unlabelled Data for Hybrid Object Classification, 2005.

[64] Thomas Serre, et al. A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex, 2005.

[65] Peter Auer, et al. Generic object recognition with boosting, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[66] Stanley M. Bileschi, et al. Street Scenes: towards scene understanding in still images, 2006.

[67] Lior Wolf, et al. Perception Strategies in Hierarchical Vision Systems, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[68] Manuel J. Marín-Jiménez, et al. Empirical Study of Multi-scale Filter Banks for Object Categorization, 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[69] Tomaso Poggio, et al. Learning a dictionary of shape-components in visual cortex: comparison with neurons, humans and machines, 2006.

[70] David G. Lowe, et al. Multiclass Object Recognition with Sparse, Localized Features, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[71] C. Scott, et al. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009.

[72] Russell L. De Valois, et al. Spatial frequency selectivity of cells in macaque visual cortex, 1982.