HGAT: Heterogeneous Graph Attention Networks for Semi-supervised Short Text Classification

Short text classification has been widely explored in news tagging to provide more efficient search strategies and more effective search results for information retrieval. However, most existing studies concentrate on long text classification and deliver unsatisfactory performance on short texts due to semantic sparsity and the insufficiency of labeled data. In this article, we propose a novel heterogeneous graph neural network-based method for semi-supervised short text classification, taking full advantage of limited labeled data and abundant unlabeled data through information propagation along the graph. Specifically, we first present a flexible heterogeneous information network (HIN) framework for modeling short texts, which can integrate any type of additional information and capture the relations among them to address the sparsity issue. Then, we propose Heterogeneous Graph Attention networks (HGAT) to embed the HIN for short text classification based on a dual-level attention mechanism, comprising node-level and type-level attentions. To efficiently classify newly arriving texts that do not previously exist in the HIN, we extend HGAT to inductive learning, avoiding re-training the model on the evolving HIN. Extensive experiments on single-/multi-label classification demonstrate that our proposed model HGAT significantly outperforms state-of-the-art methods on benchmark datasets under both transductive and inductive learning.
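The dual-level attention mechanism described above can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the embeddings, attention vectors (`mu`, `nu`), node types ("topic", "entity"), and the way the type weight scales the node-level score are all illustrative assumptions; the actual HGAT layer learns these parameters jointly with a heterogeneous graph convolution.

```python
import numpy as np

rng = np.random.default_rng(0)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d = 8  # embedding dimension (illustrative)

# Hypothetical embeddings: one target short-text node and its
# neighbors, grouped by node type (e.g., topics and entities).
h_v = rng.standard_normal(d)
neighbors = {
    "topic":  rng.standard_normal((3, d)),
    "entity": rng.standard_normal((2, d)),
}

# Attention parameters (randomly initialized here; learned in practice).
mu = {t: rng.standard_normal(2 * d) for t in neighbors}  # type-level
nu = rng.standard_normal(2 * d)                          # node-level

# --- Type-level attention: how important is each node type to h_v? ---
type_summary = {t: H.sum(axis=0) for t, H in neighbors.items()}
type_scores = np.array([
    leaky_relu(mu[t] @ np.concatenate([h_v, type_summary[t]]))
    for t in neighbors
])
type_weights = softmax(type_scores)  # one weight per type, sums to 1

# --- Node-level attention: weight each neighbor, scaled by its type ---
scores, embeds = [], []
for a_t, (t, H) in zip(type_weights, neighbors.items()):
    for h_u in H:
        scores.append(leaky_relu(nu @ np.concatenate([h_v, h_u])) * a_t)
        embeds.append(h_u)
att = softmax(np.array(scores))  # attention over all neighbors, sums to 1

# Aggregate neighbors into the updated representation of the target node.
h_v_new = (att[:, None] * np.array(embeds)).sum(axis=0)
print(h_v_new.shape)
```

The two softmaxes make the key design point visible: type-level attention first decides which kinds of information (topics, entities, words) matter for a given short text, and node-level attention then weights individual neighbors within that budget, reducing the influence of noisy neighbors of an otherwise relevant type.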
