Perception Strategies in Hierarchical Vision Systems

Flat appearance-based systems, which combine clever image representations with standard classifiers, may be the most effective way to recognize objects with current technology. In the future, however, it seems probable that hierarchical representations will perform better. In such systems, the image representation consists of a sequence of feature sets, where each set is computed from the preceding sets. The main contributions of this paper are to: (1) pose the question "what is the best way to employ discriminative methods for hierarchical image representations?"; (2) enumerate some of the alternative hierarchies while drawing connections to recent work by brain researchers; and (3) study the different alternatives experimentally. As we will show, the strategy used can make a substantial difference.
