Contrastive Learning for Neural Topic Model

Recent empirical studies show that adversarial topic models (ATMs) can successfully capture the semantic patterns of a document by differentiating it from a dissimilar sample. However, this discriminative-generative architecture has two important drawbacks: (1) it does not relate similar documents, which share the same document-word distribution over salient words; (2) it restricts the ability to integrate external information, such as document sentiment, which has been shown to benefit the training of neural topic models. To address these issues, we revisit the adversarial topic architecture from the viewpoint of mathematical analysis, propose a novel approach that reformulates the discriminative goal as an optimization problem, and design a novel sampling method that facilitates the integration of external variables. The reformulation encourages the model to incorporate relations among similar samples and enforces a constraint on the similarity among dissimilar ones, while the sampling method, based on the internal input and the reconstructed output, informs the model of the salient words that contribute to the main topic. Experimental results show that our framework outperforms other state-of-the-art neural topic models in terms of topic coherence on three common benchmark datasets spanning various domains, vocabulary sizes, and document lengths.
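The two ingredients described above can be illustrated with a minimal sketch: a positive sample built from the words the model reconstructs most strongly (assumed salient), a negative sample built from the weakly reconstructed words, and an InfoNCE-style contrastive loss over topic representations. This is a simplified illustration under our own assumptions (function names, the top-k/bottom-k selection rule, and the cosine-similarity loss are illustrative choices, not the paper's exact formulation):

```python
import numpy as np

def build_pos_neg(bow, recon, k=10):
    """Illustrative sampling based on input and reconstructed output:
    keep the k in-document words with the highest reconstruction weight
    as the positive sample, and the k lowest as the negative sample."""
    present = np.nonzero(bow)[0]                  # word ids present in the document
    order = present[np.argsort(-recon[present])]  # sort by descending recon weight
    pos_ids, neg_ids = order[:k], order[-k:]
    pos, neg = np.zeros_like(bow), np.zeros_like(bow)
    pos[pos_ids] = bow[pos_ids]                   # salient words only
    neg[neg_ids] = bow[neg_ids]                   # weakly reconstructed words only
    return pos, neg

def contrastive_loss(z, z_pos, z_neg, tau=0.5):
    """InfoNCE-style objective on topic representations: pull the anchor
    toward its positive sample and push it away from the negative one."""
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    s_pos = np.exp(cos(z, z_pos) / tau)
    s_neg = np.exp(cos(z, z_neg) / tau)
    return -np.log(s_pos / (s_pos + s_neg))
```

In a full model, `z`, `z_pos`, and `z_neg` would be the encoder's topic distributions for the document and its two constructed samples, and this loss would be added to the reconstruction objective of the underlying topic model.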
