A Feature-Enhanced Ranking-Based Classifier for Multimodal Data and Heterogeneous Information Networks

We propose a heterogeneous information network mining algorithm: feature-enhanced RankClass (F-RankClass). F-RankClass extends RankClass to a unified classification framework that can be applied to binary or multiclass classification of unimodal or multimodal data. We evaluate it on a multimodal document dataset, the 2008/9 Wikipedia Selection for Schools. For unimodal classification, we compare F-RankClass to support vector machines (SVMs); F-RankClass provides improvements of up to 27.3% on the Wikipedia dataset. For multimodal document classification, F-RankClass shows improvements of up to 19.7% in accuracy compared to SVM-based meta-classifiers. We also study how (1) the structure of the network and (2) the choice of parameters affect the classification results.
