Visual learning and recognition of 3-D objects from appearance

The problem of automatically learning object models for recognition and pose estimation is addressed. In contrast to the traditional approach, the recognition problem is formulated as one of matching appearance rather than shape. The appearance of an object in a two-dimensional image depends on its shape, reflectance properties, pose in the scene, and the illumination conditions. While shape and reflectance are intrinsic properties and constant for a rigid object, pose and illumination vary from scene to scene. A compact representation of object appearance is proposed that is parametrized by pose and illumination. For each object of interest, a large set of images is obtained by automatically varying pose and illumination. This image set is compressed to obtain a low-dimensional subspace, called the eigenspace, in which the object is represented as a manifold. Given an unknown input image, the recognition system projects the image into the eigenspace. The object is recognized based on the manifold on which the projection lies. The exact position of the projection on the manifold determines the object's pose in the image.

A variety of experiments are conducted using objects with complex appearance characteristics. The performance of the recognition and pose estimation algorithms is studied using over a thousand input images of sample objects. The sensitivity of recognition to the number of eigenspace dimensions and the number of learning samples is analyzed. For the objects used, appearance representation in eigenspaces with fewer than 20 dimensions produces accurate recognition results with an average pose estimation error of about 1.0 degree. A near real-time recognition system with 20 complex objects in the database has been developed. The paper concludes with a discussion of various issues related to the proposed learning and recognition methodology.
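The pipeline described in the abstract (compress an image set into an eigenspace, represent each object as a manifold, project an unknown image and find the closest manifold point) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy "images" are synthetic 64-pixel signals, the function names (`render`, `build_eigenspace`, `project`, `recognize`) are hypothetical, illumination is not modeled, and each manifold is approximated by its discrete sampled points with a nearest-point search rather than by a continuous curve or surface.

```python
import numpy as np

def render(freq, pose_deg, n_pixels=64):
    """Toy stand-in for an image: a 1-D signal whose phase depends on pose."""
    x = np.linspace(0.0, 2.0 * np.pi, n_pixels, endpoint=False)
    return np.sin(freq * x + np.deg2rad(pose_deg))

def build_eigenspace(images, k):
    """Compress an image set (n_samples, n_pixels) to a k-dimensional subspace."""
    mean = images.mean(axis=0)
    # Rows of vt are the principal directions of the centered image set.
    _, _, vt = np.linalg.svd(images - mean, full_matrices=False)
    return mean, vt[:k]

def project(image, mean, basis):
    """Coordinates of an image in the eigenspace."""
    return basis @ (image - mean)

def recognize(image, mean, basis, manifolds):
    """Return (object name, pose, distance) of the closest sampled manifold point."""
    z = project(image, mean, basis)
    best = None
    for name, (poses, points) in manifolds.items():
        dists = np.linalg.norm(points - z, axis=1)
        i = int(np.argmin(dists))
        if best is None or dists[i] < best[2]:
            best = (name, float(poses[i]), float(dists[i]))
    return best

# Learning stage: sample each toy object every 10 degrees of pose.
poses = np.arange(0, 360, 10)
objects = {"a": 1, "b": 2}  # hypothetical objects, distinguished by signal frequency
training = {n: np.stack([render(f, p) for p in poses]) for n, f in objects.items()}

mean, basis = build_eigenspace(np.vstack(list(training.values())), k=4)

# Each object's manifold: its training images projected into the eigenspace.
manifolds = {n: (poses, (imgs - mean) @ basis.T) for n, imgs in training.items()}

# Recognition stage: an unseen pose (93 degrees) of object "b".
name, pose, dist = recognize(render(2, 93.0), mean, basis, manifolds)
print(name, pose, dist)
```

In this toy setting the query should be assigned to object "b" with the nearest sampled pose, 90 degrees. Pose accuracy here is limited by the 10-degree sampling interval; interpolating between sampled manifold points would be one way to refine the estimate.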
