Supervised Nonparametric Multimodal Topic Modeling Methods for Multi-class Video Classification

Nonparametric topic models such as hierarchical Dirichlet processes (HDP) have attracted growing attention for multimedia data analysis. However, existing models for multimedia data are unsupervised: they cluster semantically or characteristically related features into latent topics without taking side information, such as class labels, into account. In this paper, we present a novel supervised sequential symmetric correspondence HDP (Sup-SSC-HDP) model for multi-class video classification, in which the empirical topic frequencies learned from multimodal video data are modeled as a predictor of the video class. Qualitative and quantitative assessments demonstrate the effectiveness of Sup-SSC-HDP.
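
To make the idea of using empirical topic frequencies as a class predictor concrete, the following is a minimal sketch and not the Sup-SSC-HDP inference procedure itself. It assumes a softmax (multinomial logistic) link between topic frequencies and classes, which the abstract does not specify; the function names, the topic assignments, and the weight matrix `eta` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def empirical_topic_frequencies(assignments, num_topics):
    """Fraction of a video's tokens assigned to each currently instantiated topic."""
    counts = Counter(assignments)
    total = len(assignments)
    return [counts[k] / total for k in range(num_topics)]


def class_probabilities(z_bar, eta):
    """Softmax over linear scores; eta holds one weight row per class (assumed link)."""
    scores = [sum(w * z for w, z in zip(row, z_bar)) for row in eta]
    top = max(scores)  # subtract the maximum score for numerical stability
    exp_scores = [math.exp(s - top) for s in scores]
    norm = sum(exp_scores)
    return [e / norm for e in exp_scores]


# Hypothetical topic assignments for the tokens of one video (4 topics in use).
assignments = [0, 0, 2, 2, 2, 3, 0, 2]
z_bar = empirical_topic_frequencies(assignments, num_topics=4)

# Hypothetical weights for 3 classes; a real model would learn these jointly
# with the topics rather than fix them by hand.
eta = [
    [2.0, 0.0, -1.0, 0.0],
    [-1.0, 0.0, 3.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]
probs = class_probabilities(z_bar, eta)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

In a nonparametric model the number of topics is not fixed in advance, so `num_topics` here stands for the topics instantiated at a given point during inference.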
