Modeling Tag Dependencies in Tagged Documents

We present a general approach for modeling tagged documents with topic models. The approach extends related topic models by exploiting the dependencies between tags. We show that the resulting model improves performance on a prediction task in which the goal is to predict missing tags for new documents. Its predictions also compare favorably with those of SVMs.
