SEEMORE: a view-based approach to 3-D object recognition using multiple visual cues

A view-based, high-dimensional feature-space recognition system called SEEMORE was developed as a testbed to explore the representational trade-offs that arise when a simple feedforward neural architecture is challenged with a difficult 3D object recognition problem. Particular emphasis was placed on designing an object representation that could: 1) cope with a large number of real 3D objects of many different types; 2) operate directly on input images without shift, scale, or other object pre-normalization steps; 3) integrate multiple visual cues; and 4) recognize objects over 6 degrees of freedom of viewpoint, gross non-rigid shape distortions, and/or partial occlusion. Recognition results were obtained using a set of 102 color and shape feature channels, each designed to be invariant to image-plane shifts and rotations, and only modestly sensitive to orientation in depth. In response to a test set of 600 novel views of 100 objects presented individually in color video images, SEEMORE identified the object correctly 97% of the time using a nearest neighbour classifier. Similar levels of performance were obtained for the subset of 15 non-rigid objects.
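
The classification step described above, matching a novel view against stored training views in a 102-dimensional feature space, can be illustrated with a minimal sketch. This is not SEEMORE's implementation: the abstract does not state the distance metric, so Euclidean distance is assumed here, and the feature vectors are synthetic placeholders (a random prototype per object plus Gaussian noise) rather than outputs of the color and shape channels. The names `euclidean`, `nearest_neighbour`, and `make_toy_views` are hypothetical.

```python
import math
import random

NUM_CHANNELS = 102  # channel count reported in the abstract


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbour(query, train_vectors, train_labels):
    """Return the label of the stored training view closest to the query."""
    best = min(range(len(train_vectors)),
               key=lambda i: euclidean(query, train_vectors[i]))
    return train_labels[best]


def make_toy_views(prototypes, views_per_object, noise, rng):
    """Hypothetical stand-in for feature extraction: prototype plus Gaussian noise."""
    vectors, labels = [], []
    for label, proto in enumerate(prototypes):
        for _ in range(views_per_object):
            vectors.append([p + rng.gauss(0.0, noise) for p in proto])
            labels.append(label)
    return vectors, labels


# Toy data: 5 objects, 4 training views and 3 novel test views each.
rng = random.Random(0)
prototypes = [[rng.random() for _ in range(NUM_CHANNELS)] for _ in range(5)]
train_x, train_y = make_toy_views(prototypes, 4, 0.05, rng)
test_x, test_y = make_toy_views(prototypes, 3, 0.05, rng)

# Classify each novel view by its single nearest stored view.
predictions = [nearest_neighbour(x, train_x, train_y) for x in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"toy accuracy: {accuracy:.2f}")
```

Because each stored view acts as its own exemplar, this scheme needs no training beyond storing feature vectors. Its accuracy depends on how little the features change across views of the same object, which is why the abstract stresses channels that are invariant to image-plane shifts and rotations.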
