Factorized Multi-Modal Topic Model

Multi-modal data collections, such as corpora of paired images and text snippets, require analysis methods beyond single-view component and topic models. For continuous observations, the currently dominant approach is based on extensions of canonical correlation analysis, which factorize the variation into components shared by the different modalities and components private to each of them. For count data, several topic model variants that attempt to tie the modalities together have been presented. All of these, however, lack the ability to learn components private to one modality, and consequently try to force dependencies even between weakly correlated modalities. In this work we combine the two approaches by presenting a novel topic model based on the hierarchical Dirichlet process (HDP) that automatically learns both shared and private topics. The model is shown to be especially useful for querying the contents of one domain given samples of the other.
