Heterogeneous Transfer Learning for Image Clustering via the Social Web

In this paper, we present a new learning scenario, heterogeneous transfer learning, which improves learning performance when the data lie in different feature spaces and no correspondence between data instances in these spaces is provided. In previous work, we classified Chinese text documents using English training data under the heterogeneous transfer learning framework. Here, we use image clustering as an example to illustrate how unsupervised learning can be improved by transferring knowledge from auxiliary heterogeneous data obtained from the social Web. Image clustering is useful for image sense disambiguation in query-based image search, but its quality is often low because of the image data sparsity problem. We extend probabilistic latent semantic analysis (PLSA) to transfer knowledge from social Web data, which have mixed feature representations. Experiments on image-object clustering and scene clustering tasks show that heterogeneous transfer learning based on the auxiliary data is effective.
