Regularized Graph Convolutional Networks for Short Text Classification

Short text classification is a fundamental problem in natural language processing, social network analysis, and e-commerce. The lack of structure in short text sequences limits the success of popular NLP methods based on deep learning. Simpler methods that rely on bag-of-words representations tend to perform on par with complex deep learning methods. To tackle the limitations of textual features in short text, we propose a Graph-regularized Graph Convolution Network (GR-GCN), which augments graph convolution networks by incorporating label dependencies in the output space. Our model achieves state-of-the-art results on both proprietary and external datasets, outperforming several baseline methods by up to 6% . Furthermore, we show that compared to baseline methods, GR-GCN is more robust to noise in textual features.

[1]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[2]  Ying Li,et al.  Product query classification , 2009, CIKM.

[3]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[4]  Venu Govindaraju,et al.  Cognitive-Biometric Recognition From Language Usage: A Feasibility Study , 2017, IEEE Transactions on Information Forensics and Security.

[5]  Jürgen Schmidhuber,et al.  Framewise phoneme classification with bidirectional LSTM and other neural network architectures , 2005, Neural Networks.

[6]  Guoyin Wang,et al.  Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms , 2018, ACL.

[7]  Yuan Luo,et al.  Graph Convolutional Networks for Text Classification , 2018, AAAI.

[8]  Zhichao Yang,et al.  Word Embedding Perturbation for Sentence Classification , 2018, ArXiv.

[9]  Jugal Kalita,et al.  Classifying Short Text in Social Media: Twitter as Case Study , 2015 .

[10]  Tomas Mikolov,et al.  Bag of Tricks for Efficient Text Classification , 2016, EACL.

[11]  Guillermo Sapiro,et al.  Kernelized Probabilistic Matrix Factorization: Exploiting Graphs and Side Information , 2012, SDM.

[12]  Karthik Subbian,et al.  Short Text Classification using Graph Convolutional Network , 2019 .

[13]  Tong Zhang,et al.  On the Effectiveness of Laplacian Normalization for Graph Semi-supervised Learning , 2007, J. Mach. Learn. Res..

[14]  Hang Li,et al.  An Information Retrieval Approach to Short Text Conversation , 2014, ArXiv.

[15]  Tomas Mikolov,et al.  Enriching Word Vectors with Subword Information , 2016, TACL.

[16]  K. Bretonnel Cohen,et al.  A shared task involving multi-label classification of clinical free text , 2007, BioNLP@ACL.

[17]  Julian J. McAuley,et al.  Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering , 2016, WWW.

[18]  Gerard Salton,et al.  Term-Weighting Approaches in Automatic Text Retrieval , 1988, Inf. Process. Manag..

[19]  Chia-Hua Ho,et al.  Product Title Classification versus Text Classification , 2012 .

[20]  Xiang Zhang,et al.  Character-level Convolutional Networks for Text Classification , 2015, NIPS.

[21]  Xavier Bresson,et al.  Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering , 2016, NIPS.

[22]  Jürgen Schmidhuber,et al.  Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.

[23]  Pradeep Ravikumar,et al.  Collaborative Filtering with Graph Information: Consistency and Scalable Methods , 2015, NIPS.

[24]  SaltonGerard,et al.  Term-weighting approaches in automatic text retrieval , 1988 .