Hierarchical Hybrid Attention Networks for Chinese Conversation Topic Classification

Topic classification is useful for applications such as forensic analysis and cyber-crime investigation. To improve performance on Chinese conversation topic classification, we propose a hierarchical neural network with automatic semantic feature selection, whose architecture mirrors the structure of a conversation. The model first incorporates speaker information into character-level and word-level attention to generate sentence representations, then uses an attention-based BLSTM to construct the conversation representation. Experimental results on three datasets show that our model outperforms multiple baselines. This indicates that the proposed architecture captures the informative and salient features related to the meaning of a conversation for topic classification. The dataset used in this paper is available at https://github.com/njoe9/H-HANs.
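
The abstract does not give the attention equations, so the following is only a minimal sketch of what speaker-aware attention pooling could look like, not the authors' implementation. It assumes an additive scoring form, score_t = v · tanh(W_state h_t + W_speaker s), and all names (`attention_pool`, `w_state`, `w_speaker`, `v`) and dimensions are hypothetical. The same pooling step could in principle be applied to character or word states to form a sentence vector, and again over sentence states to form a conversation vector.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention_pool(states, speaker, w_state, w_speaker, v):
    """Pool a sequence of state vectors into a single vector.

    Assumed scoring form (not taken from the paper):
        score_t = v . tanh(W_state h_t + W_speaker s)
    where s is a speaker embedding shared by every position.
    """
    # The speaker contribution is the same for every position, so compute it once.
    speaker_term = [dot(row, speaker) for row in w_speaker]
    scores = []
    for h in states:
        hidden = [math.tanh(dot(row, h) + sp)
                  for row, sp in zip(w_state, speaker_term)]
        scores.append(dot(v, hidden))
    weights = softmax(scores)
    dim = len(states[0])
    # Weighted sum of the input states.
    pooled = [sum(w * h[i] for w, h in zip(weights, states)) for i in range(dim)]
    return pooled, weights

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

# Toy dimensions: 5 positions, 8-dim states, 4-dim speaker embedding, 6-dim attention space.
rng = random.Random(0)
T, d, s, a = 5, 8, 4, 6
states = rand_matrix(T, d, rng)          # stand-in for BLSTM outputs
speaker = [rng.uniform(-1, 1) for _ in range(s)]
w_state = rand_matrix(a, d, rng)
w_speaker = rand_matrix(a, s, rng)
v = [rng.uniform(-1, 1) for _ in range(a)]

sentence_vec, weights = attention_pool(states, speaker, w_state, w_speaker, v)
```

In a trained model the random matrices would be learned parameters and `states` would come from the recurrent encoder; here they are random placeholders so the sketch runs on its own.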
