Generic object recognition by combining distinct features in machine learning

In a generic image object recognition or categorization system, the relevant features or descriptors of a characteristic point, patch or region of an image are often obtained by different approaches. These features are typically selected and learned separately by machine learning methods. In this paper, the relation between distinct features obtained by different feature extraction approaches from the same original images is studied using Kernel Canonical Correlation Analysis (KCCA). We apply a Support Vector Machine (SVM) classifier in the learnt semantic space of the combined features and compare it against an SVM on the raw data and against previously published state-of-the-art results. Experiments show that the SVM in the semantic space achieves a significant improvement over direct SVM classification on the raw data.
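As an illustration of the KCCA step described above, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian (RBF) kernel, a ridge-style regularizer `kappa`, and synthetic two-view data standing in for two feature extractors applied to the same images; the abstract specifies none of these. The function names `rbf_kernel`, `center_kernel` and `kcca_fit` are hypothetical. The regularized problem is solved here through an SVD, which is one common formulation among several.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def center_kernel(K):
    """Center a square kernel matrix in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kcca_fit(Kx, Ky, kappa, n_components):
    """Regularized KCCA on two centered kernel matrices (illustrative).

    Returns dual weights alpha, beta and the regularized correlations s.
    """
    n = Kx.shape[0]
    I = np.eye(n)
    # Rx = (Kx + kappa I)^-1 Kx; likewise Ry. Each is symmetric because
    # a matrix commutes with its own shifted inverse.
    Rx = np.linalg.solve(Kx + kappa * I, Kx)
    Ry = np.linalg.solve(Ky + kappa * I, Ky)
    # Singular vectors of Rx Ry give the canonical directions.
    U, s, Vt = np.linalg.svd(Rx @ Ry)
    alpha = np.linalg.solve(Kx + kappa * I, U[:, :n_components])
    beta = np.linalg.solve(Ky + kappa * I, Vt[:n_components].T)
    return alpha, beta, s[:n_components]

# Synthetic stand-in for two feature views of the same 80 "images":
# both views depend on a shared latent variable z (an assumption made
# purely for this demonstration).
rng = np.random.default_rng(0)
z = rng.normal(size=80)
X = np.c_[z, z ** 2] + 0.1 * rng.normal(size=(80, 2))
Y = np.c_[np.sin(z), np.cos(z)] + 0.1 * rng.normal(size=(80, 2))

Kx = center_kernel(rbf_kernel(X, X, gamma=0.5))
Ky = center_kernel(rbf_kernel(Y, Y, gamma=0.5))
alpha, beta, s = kcca_fit(Kx, Ky, kappa=1.0, n_components=3)

# Projections of the training samples into the shared (semantic) space.
Px = Kx @ alpha
Py = Ky @ beta
corr = np.corrcoef(Px[:, 0], Py[:, 0])[0, 1]
print("regularized correlations:", s)
print("training correlation of first component:", corr)
```

The projected features `Px` (or a combination of `Px` and `Py`) would then be passed to an SVM in place of the raw descriptors. Projecting unseen images additionally requires centering the test-versus-training kernel with the training statistics, which this sketch omits.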
