Deep Learning Method with Attention for Extreme Multi-label Text Classification

Extreme multi-label text classification (XMTC), the problem of finding the most relevant subset of labels for each document from hundreds or even millions of candidate labels, has been a practical and important problem since the boom of big data. Significant progress has been made in recent years through the development of machine learning methods. However, although deep learning methods have outperformed traditional methods in related areas, they show no clear advantage in XMTC in terms of prediction performance. To improve the performance of deep learning methods for XMTC, we propose a novel feature extraction method that better explores the text space. Specifically, we build a model combining an attention mechanism, a convolutional neural network, and a recurrent neural network to extract multi-view features. Extensive experiments on four publicly available datasets show that our method achieves better performance than several strong baselines, including both traditional and deep learning methods.
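The multi-view idea described above can be illustrated with a minimal numpy sketch: three parallel encoders (attention pooling, a TextCNN-style convolution with max-over-time pooling, and a vanilla RNN) each summarize the same token-embedding matrix, and their outputs are concatenated into one document representation. All dimensions and weight initializations here are hypothetical placeholders, not the paper's actual architecture or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): 10 tokens, 8-dim embeddings, 6 hidden units.
T, D, H = 10, 8, 6
x = rng.standard_normal((T, D))  # token embedding matrix for one document

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# View 1: attention pooling -- a learned vector scores each token,
# and the document summary is the attention-weighted sum of embeddings.
w_att = rng.standard_normal(D)
alpha = softmax(x @ w_att)        # (T,) attention weights, sum to 1
v_att = alpha @ x                 # (D,) attended summary

# View 2: 1-D convolution (kernel width 3) + max-over-time pooling,
# in the style of TextCNN.
W_conv = rng.standard_normal((3 * D, H))
windows = np.stack([x[i:i + 3].ravel() for i in range(T - 2)])  # (T-2, 3D)
v_cnn = np.tanh(windows @ W_conv).max(axis=0)                   # (H,)

# View 3: a vanilla RNN; the last hidden state summarizes the sequence.
W_in = rng.standard_normal((D, H))
W_h = rng.standard_normal((H, H))
h = np.zeros(H)
for t in range(T):
    h = np.tanh(x[t] @ W_in + h @ W_h)
v_rnn = h                         # (H,)

# Concatenate the three views into a single document vector,
# which a label classifier head would then consume.
doc_vec = np.concatenate([v_att, v_cnn, v_rnn])
print(doc_vec.shape)              # (D + 2*H,) = (20,)
```

In a trained model these weights would be learned jointly with a sigmoid output layer over the label set; the sketch only shows how the three feature views are produced and fused.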
