CatE: Category-Name Guided Word Embedding

Unsupervised word embeddings have benefited a wide spectrum of NLP tasks thanks to their effectiveness in encoding word semantics as distributed representations. However, unsupervised embeddings are generic representations, not optimized for any specific task. In this work, we propose CatE, a weakly-supervised word embedding framework. CatE uses category names to guide word embedding and effectively selects category-representative words to regularize the embedding space so that the categories are well separated. Experiments show that our model significantly outperforms unsupervised word embedding models on both document classification and category-representative word retrieval.
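To illustrate the retrieval side of this idea, the sketch below ranks vocabulary words by cosine similarity to a category-name vector in a shared embedding space. This is a deliberately simplified stand-in, not CatE itself: the actual framework jointly trains the embeddings under category-name guidance, whereas here the toy embeddings, vocabulary, and the `representative_words` helper are all hypothetical.

```python
import numpy as np

def representative_words(embeddings, vocab, category_vec, k=3):
    """Rank vocabulary words by cosine similarity to a category vector.

    A simplified illustration of selecting category-representative
    words; the real model learns the embedding space jointly with
    the category names rather than scoring fixed vectors.
    """
    # Normalize rows so a dot product equals cosine similarity.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cat = category_vec / np.linalg.norm(category_vec)
    scores = emb @ cat
    top = np.argsort(-scores)[:k]  # indices of the k highest scores
    return [vocab[i] for i in top]

# Toy 2-D embeddings: "sports"-like words point one way,
# "politics"-like words the other.
vocab = ["soccer", "tennis", "election", "senate", "goal"]
embeddings = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
    [0.7, 0.3],
])
sports_vec = np.array([1.0, 0.0])  # embedding of the category name "sports"
print(representative_words(embeddings, vocab, sports_vec, k=3))
# → ['soccer', 'tennis', 'goal']
```

In the full model, such representative words would then serve as anchors that pull same-category words together and push different categories apart in the embedding space.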
