Discriminative Topic Modeling Based on Manifold Learning

Topic modeling has been popularly used for data analysis in various domains including text documents. Previous topic models, such as probabilistic Latent Semantic Analysis (pLSA) and Latent Dirichlet Allocation (LDA), have shown impressive success in discovering low-rank hidden structures for modeling text documents. These models, however, do not take into account the manifold structure of data, which is generally informative for the non-linear dimensionality reduction mapping. More recent models, namely Laplacian PLSI (LapPLSI) and Locally-consistent Topic Model (LTM), have incorporated the local manifold structure into topic models and have shown the resulting benefits. But these approaches fall short of the full discriminating power of manifold learning as they only enhance the proximity between the low-rank representations of neighboring pairs without any consideration for non-neighboring pairs. In this paper, we propose Discriminative Topic Model (DTM) that separates non-neighboring pairs from each other in addition to bringing neighboring pairs closer together, thereby preserving the global manifold structure as well as improving the local consistency. We also present a novel model fitting algorithm based on the generalized EM and the concept of Pareto improvement. As a result, DTM achieves higher classification performance in a semi-supervised setting by effectively exposing the manifold structure of data. We provide empirical evidence on text corpora to demonstrate the success of DTM in terms of classification accuracy and robustness to parameters compared to state-of-the-art techniques.

[1]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[2]  Gerard Salton,et al.  Term-Weighting Approaches in Automatic Text Retrieval , 1988, Inf. Process. Manag..

[3]  Richard A. Harshman,et al.  Indexing by Latent Semantic Analysis , 1990, J. Am. Soc. Inf. Sci..

[4]  A. Atkinson The Economics of the Welfare State , 1996 .

[5]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[6]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[7]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[8]  Mikhail Belkin,et al.  Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering , 2001, NIPS.

[9]  Geoffrey E. Hinton,et al.  Stochastic Neighbor Embedding , 2002, NIPS.

[10]  Daniel D. Lee,et al.  Multiplicative Updates for Nonnegative Quadratic Programming in Support Vector Machines , 2002, NIPS.

[11]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[12]  Bernhard Schölkopf,et al.  Learning with Local and Global Consistency , 2003, NIPS.

[13]  Nicolas Le Roux,et al.  Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering , 2003, NIPS.

[14]  Xin Liu,et al.  Document clustering based on non-negative matrix factorization , 2003, SIGIR.

[15]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[16]  Daniel Boley,et al.  Principal Direction Divisive Partitioning , 1998, Data Mining and Knowledge Discovery.

[17]  Mikhail Belkin,et al.  Manifold Regularization : A Geometric Framework for Learning from Examples , 2004 .

[18]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[19]  Stephen Lin,et al.  Graph Embedding and Extensions: A General Framework for Dimensionality Reduction , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  David M. Blei,et al.  Supervised Topic Models , 2007, NIPS.

[21]  Jiawei Han,et al.  ACM Transactions on Knowledge Discovery from Data: Introduction , 2007 .

[22]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[23]  Michael I. Jordan,et al.  DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification , 2008, NIPS.

[24]  Carey E. Priebe,et al.  Semisupervised learning from dissimilarity data , 2008, Comput. Stat. Data Anal..

[25]  Jiawei Han,et al.  Modeling hidden topics on document manifold , 2008, CIKM '08.

[26]  Eric P. Xing,et al.  MedLDA: maximum margin supervised topic models for regression and classification , 2009, ICML '09.

[27]  Deng Cai,et al.  Probabilistic dyadic data analysis with local and global consistency , 2009, ICML '09.

[28]  Harold W. Kuhn,et al.  The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.

[29]  Stephen E. Fienberg,et al.  Discriminative Topic Modeling Based on Manifold Learning , 2012, ACM Trans. Knowl. Discov. Data.

[30]  U. Feige,et al.  Spectral Graph Theory , 2015 .