MsCoa: Multi-Step Co-Attention Model for Multi-Label Classification

Multi-label text classification (MLC), a sub-task of natural language processing, has broad application prospects. Building on prior work, this research frames the task around the relationships among the text, the leading (already predicted) labels, and the label to be predicted, and analyzes two failure modes of existing approaches: information loss from the original text and the leading labels, and the accumulation of decoding errors. We propose MsCoa, an improved multi-step co-attention model that mitigates erroneous predictions, label repetition, and error accumulation. The model casts multi-label prediction as a sequence of multi-class classification steps: at each step, it takes the leading labels and the original text as input and outputs the next label to be predicted. A co-attention mechanism operates between the original text and the leading labels; attending from the original text to the leading labels helps filter out errors accumulated from earlier mispredictions. The attended features are combined by difference and concatenation, which strengthens the model's feature extraction. To avoid the sensitivity of long short-term memory (LSTM) decoders to the feature dimension, a multi-layer fully-connected classifier predicts each label instead. Experiments show that our model achieves state-of-the-art performance on the multi-label text classification task, demonstrating its effectiveness.
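
The abstract outlines the architecture without giving an implementation, so the following is a minimal PyTorch sketch of one prediction step under our own assumptions. The class name `MsCoaStep`, the layer sizes, the bidirectional-LSTM text encoder, and the mean pooling are hypothetical; only the overall structure (co-attention between the text and the leading labels, fusion by difference and concatenation, and a multi-layer fully-connected classifier in place of an LSTM decoder) follows the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MsCoaStep(nn.Module):
    """One prediction step: co-attention between the encoded text and the
    leading labels, feature fusion by difference + concatenation, and a
    multi-layer fully-connected classifier for the next label.
    All dimensions and pooling choices are illustrative assumptions."""

    def __init__(self, hidden_dim: int, num_labels: int):
        super().__init__()
        self.label_emb = nn.Embedding(num_labels + 1, hidden_dim)  # +1 for a <start> token
        self.text_encoder = nn.LSTM(hidden_dim, hidden_dim,
                                    batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden_dim, hidden_dim)
        # fused vector = [text ; labels ; text - labels] -> 3 * hidden_dim
        self.classifier = nn.Sequential(
            nn.Linear(3 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_labels),
        )

    def forward(self, text_emb, leading_labels):
        # text_emb: (B, T, H) token embeddings; leading_labels: (B, L) label ids
        enc, _ = self.text_encoder(text_emb)           # (B, T, 2H)
        text = self.proj(enc)                          # (B, T, H)
        labels = self.label_emb(leading_labels)        # (B, L, H)

        # Co-attention: affinity between every text position and every leading label.
        affinity = torch.bmm(text, labels.transpose(1, 2))  # (B, T, L)
        # Leading labels attend to the text (normalize over text positions).
        text_ctx = torch.bmm(F.softmax(affinity, dim=1).transpose(1, 2), text)   # (B, L, H)
        # The text attends to the leading labels (normalize over labels),
        # which lets the model down-weight earlier mispredictions.
        label_ctx = torch.bmm(F.softmax(affinity, dim=2), labels)                # (B, T, H)

        # Pool each side to a fixed-size vector (mean pooling is an assumption).
        t = (text + label_ctx).mean(dim=1)             # (B, H)
        l = text_ctx.mean(dim=1)                       # (B, H)

        # Fuse by concatenation and difference, then classify the next label
        # with the multi-layer fully-connected head instead of an LSTM decoder.
        fused = torch.cat([t, l, t - l], dim=-1)       # (B, 3H)
        return self.classifier(fused)                  # (B, num_labels) logits
```

At inference time, one would presumably call this step repeatedly, appending each predicted label to `leading_labels` until a stop label is produced; how the original paper handles stopping and label ordering is not specified in the abstract.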
