Feature Relationships Hypergraph for Multimodal Recognition

Describing multimedia data with multimodal features is a natural route to accurate pattern recognition. However, two crucial challenges remain: handling the complex relationships among the large number of multimodal features, and coping with the curse of dimensionality. To address both, a new multimodal feature integration method is proposed. First, a Feature Relationships Hypergraph (FRH) is constructed to model the high-order correlations among the multimodal features. Then, based on the FRH, the features are clustered into a set of low-dimensional partitions, and two types of matrices, the inter-partition matrix and the intra-partition matrix, are computed to quantify the inter- and intra-partition relationships. Finally, a multi-class boosting strategy combines the weak classifiers learned from the intra-partition matrices into a strong classifier. Experimental results on different datasets validate the effectiveness of our approach.
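
The abstract does not say how the FRH is built or how it is partitioned, so the following is only an illustrative sketch of one common way to split the vertices of a hypergraph, not the authors' procedure. It assumes each vertex is a feature, each hyperedge groups features that are jointly related, and the split is taken from the normalized hypergraph Laplacian used in spectral hypergraph clustering. The function names, the incidence matrix `H`, and the weights `w` are hypothetical and chosen for the example.

```python
import numpy as np


def hypergraph_laplacian(H, w):
    """Normalized hypergraph Laplacian: I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}.

    H : (n_vertices, n_edges) binary incidence matrix (vertex = feature).
    w : (n_edges,) positive hyperedge weights.
    Assumes every vertex belongs to at least one hyperedge.
    """
    H = np.asarray(H, dtype=float)
    w = np.asarray(w, dtype=float)
    d_v = H @ w            # vertex degree: total weight of incident hyperedges
    d_e = H.sum(axis=0)    # hyperedge degree: number of vertices it contains
    inv_sqrt_dv = 1.0 / np.sqrt(d_v)
    theta = (H * (w / d_e)) @ H.T                      # H W De^{-1} H^T
    theta = inv_sqrt_dv[:, None] * theta * inv_sqrt_dv[None, :]
    return np.eye(H.shape[0]) - theta


def two_way_partition(H, w):
    """Split vertices in two by the sign of the second-smallest eigenvector."""
    lap = hypergraph_laplacian(H, w)
    _, eigvecs = np.linalg.eigh(lap)   # eigenvalues returned in ascending order
    fiedler = eigvecs[:, 1]
    return (fiedler > 0).astype(int)


# Toy hypergraph (made-up data): six features, two strong hyperedges
# {0, 1, 2} and {3, 4, 5}, plus one weak hyperedge {2, 3} linking them.
H = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 1],
    [0, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
])
w = np.array([1.0, 1.0, 0.1])
labels = two_way_partition(H, w)
print(labels)
```

On this toy input, features 0 to 2 fall in one partition and features 3 to 5 in the other; which group receives label 0 depends on the arbitrary sign of the eigenvector. Obtaining more than two partitions would require a further step, such as clustering several eigenvectors, which is not shown here.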
