Dimension Reduction and Classification Methods for Object Recognition in Vision

This paper addresses the challenging task of recognizing and locating objects in natural images. In computer vision, many successful approaches to object recognition use local image descriptors. Such descriptors do not require segmentation; in addition, they are robust to partial occlusion and invariant to image transformations (particularly scale changes). Among the existing descriptors, a recent comparison [4] showed that the SIFT descriptor [2] was particularly robust. However, the SIFT descriptor is high-dimensional (typically 128-dimensional), which penalizes classification. We therefore propose using statistical dimension reduction techniques to obtain a more discriminant representation of the data and thereby improve recognition results. We first describe the two stages of the recognition process, learning and recognition (see Fig. 1), and then present experimental results obtained on motorbike images.
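As a rough illustration of the dimension reduction step, the sketch below projects 128-dimensional descriptors onto a lower-dimensional subspace. The abstract does not name the specific technique, so principal component analysis (PCA) is used here only as a stand-in. The descriptors are random synthetic data rather than real SIFT output, and the function names `fit_pca` and `project` as well as the target dimension of 20 are assumptions made for this example.

```python
import numpy as np


def fit_pca(descriptors, n_components):
    """Fit PCA on an (n_samples, n_features) array.

    Returns the feature-wise mean and an (n_components, n_features) basis
    whose rows are orthonormal principal directions.
    """
    mean = descriptors.mean(axis=0)
    centered = descriptors - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(descriptors, mean, basis):
    """Map descriptors into the reduced space spanned by the basis rows."""
    return (descriptors - mean) @ basis.T


# Synthetic stand-in for SIFT descriptors (assumption: 500 points, 128 dimensions).
rng = np.random.default_rng(0)
descriptors = rng.random((500, 128))

# Hypothetical target dimension, chosen only for illustration.
mean, basis = fit_pca(descriptors, n_components=20)
reduced = project(descriptors, mean, basis)
```

In a two-stage setup like the one described above, the mean and basis would presumably be estimated once during learning and then reused unchanged to project descriptors at recognition time, so that both stages work in the same reduced space.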

[1]  D. G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints, 2004.

[2]  L. Saul, et al. An Introduction to Locally Linear Embedding, 2001.

[3]  M. Turk, et al. Eigenfaces for Recognition, 1991, Journal of Cognitive Neuroscience.

[4]  C. Schmid, et al. Indexing based on scale invariant interest points, 2001, Proceedings Eighth IEEE International Conference on Computer Vision (ICCV 2001).

[5]  S. T. Roweis, et al. Nonlinear dimensionality reduction by locally linear embedding, 2000, Science.

[6]  Yan Ke, et al. PCA-SIFT: a more distinctive representation for local image descriptors, 2004, CVPR 2004.

[7]  Cordelia Schmid, et al. A performance evaluation of local descriptors, 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.