Analysis of Cluttered Scenes Using an Elastic Matching Approach for Stereo Images

We present a system for the automatic interpretation of cluttered scenes containing multiple partly occluded objects in front of unknown, complex backgrounds. The system is based on an extended elastic graph matching algorithm that allows the explicit modeling of partial occlusions. Our approach extends an earlier system in two ways. First, we use elastic graph matching in stereo image pairs to increase matching robustness and disambiguate occlusion relations. Second, we use richer feature descriptions in the object models by integrating shape and texture with color features. We demonstrate that the combination of both extensions substantially increases recognition performance. The system learns about new objects in a simple one-shot learning approach. Despite the lack of statistical information in the object models and the lack of an explicit background model, our system performs surprisingly well for this very difficult task. Our results underscore the advantages of view-based feature constellation representations for difficult object recognition problems.
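To make the matching idea concrete, the following is a minimal sketch of a generic elastic graph matching score with an explicit per-node visibility flag. It is not the system described above: the function names (`feature_similarity`, `edge_distortion`, `match_score`), the cosine similarity measure, the squared edge-distortion penalty, and the weight `lam` are all illustrative assumptions. The score rewards feature agreement at nodes marked visible and penalizes geometric deformation of the model graph.

```python
import math

# Hypothetical helper names; none of these come from the paper.

def feature_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def edge_distortion(model_pos, image_pos, edges):
    """Sum of squared differences between model and image edge vectors."""
    total = 0.0
    for i, j in edges:
        mdx = model_pos[j][0] - model_pos[i][0]
        mdy = model_pos[j][1] - model_pos[i][1]
        idx = image_pos[j][0] - image_pos[i][0]
        idy = image_pos[j][1] - image_pos[i][1]
        total += (mdx - idx) ** 2 + (mdy - idy) ** 2
    return total

def match_score(model_feats, image_feats, model_pos, image_pos,
                edges, visible, lam=0.01):
    """Mean similarity over visible nodes minus lam times graph distortion.

    Nodes flagged as occluded (visible[i] is False) are left out of the
    similarity term, which is one simple way to model partial occlusion.
    """
    sims = [feature_similarity(m, f)
            for m, f, v in zip(model_feats, image_feats, visible) if v]
    mean_sim = sum(sims) / len(sims) if sims else 0.0
    return mean_sim - lam * edge_distortion(model_pos, image_pos, edges)
```

Under these assumptions, a stereo variant could evaluate `match_score` once per image of the pair and combine the two values, but how the actual system couples the left and right matches is not specified in this abstract.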
