Representing Mixtures of Word Embeddings with Mixtures of Topic Embeddings

A topic model is often formulated as a generative model that explains how each word of a document is generated given a set of topics and document-specific topic proportions. Such a model focuses on capturing word co-occurrences within a document and hence often performs poorly on short documents, where co-occurrence information is sparse. In addition, its parameter estimation often relies on approximate posterior inference that is either not scalable or suffers from large approximation error. This paper introduces a new topic-modeling framework in which each document is viewed as a set of word embedding vectors and each topic is modeled as an embedding vector in the same embedding space. With words and topics embedded in the same vector space, we define a method to measure the semantic difference between the embedding vectors of the words of a document and those of the topics, and we optimize the topic embeddings to minimize the expected difference over all documents. Experiments on text analysis demonstrate that the proposed method, which is amenable to mini-batch stochastic gradient descent and hence scalable to large corpora, provides competitive performance in discovering more coherent and diverse topics and in extracting better document representations.
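
The idea of fitting topic embeddings by minimizing a word-to-topic difference can be illustrated with a minimal sketch. The abstract does not specify the difference measure, so the code below assumes a stand-in: the squared Euclidean distance from each word vector to the topic vectors, combined through a soft-min with a hypothetical temperature `tau`. The function names (`soft_assign`, `doc_cost`, `sgd_step`), the random data, and the one-document-per-step schedule are all illustrative assumptions rather than the paper's actual objective or training procedure.

```python
import numpy as np

def soft_assign(word_vecs, topic_vecs, tau=1.0):
    """Squared distances (n, K) and softmax weights of each word over topics."""
    diff = word_vecs[:, None, :] - topic_vecs[None, :, :]   # (n, K, d)
    cost = (diff ** 2).sum(axis=-1)                         # (n, K)
    logits = -cost / tau
    logits = logits - logits.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return cost, weights

def doc_cost(word_vecs, topic_vecs, tau=1.0):
    """Assumed stand-in cost: soft-min word-to-topic distance, averaged over words."""
    cost, _ = soft_assign(word_vecs, topic_vecs, tau)
    z = -cost / tau
    m = z.max(axis=1)
    lse = m + np.log(np.exp(z - m[:, None]).sum(axis=1))    # stable log-sum-exp
    return float((-tau * lse).mean())

def sgd_step(word_vecs, topic_vecs, lr=0.1, tau=1.0):
    """One gradient step on the topic embeddings for a single document."""
    _, weights = soft_assign(word_vecs, topic_vecs, tau)
    diff = topic_vecs[None, :, :] - word_vecs[:, None, :]   # (n, K, d)
    # Gradient of the soft-min cost: softmax weights times d(cost)/d(topic).
    grad = 2.0 * (weights[:, :, None] * diff).mean(axis=0)  # (K, d)
    return topic_vecs - lr * grad

# Toy corpus: 20 "documents", each a set of 8-dimensional word vectors.
rng = np.random.default_rng(0)
docs = [rng.normal(size=(int(rng.integers(5, 15)), 8)) for _ in range(20)]
topics = rng.normal(size=(4, 8))                            # 4 topic embeddings

for epoch in range(5):
    for i in rng.permutation(len(docs)):                    # one document per step
        topics = sgd_step(docs[i], topics)
```

Because each step touches only the word vectors of the sampled document, the per-step cost does not grow with corpus size, which is the property that makes this style of objective suitable for mini-batch training.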
