Label-Embedding Bi-directional Attentive Model for Multi-label Text Classification

Multi-label text classification is a critical task in natural language processing. As the latest language representation model, BERT obtains new state-of-the-art results on classification tasks. Nevertheless, BERT's text classification framework fails to make full use of token-level text representations and label embeddings, since it uses only the final hidden state of the [CLS] token as the sequence-level text representation for classification. We assume that finer-grained token-level text representations and label embeddings contribute to classification. Consequently, in this paper we propose a Label-Embedding Bi-directional Attentive model to improve the performance of BERT's text classification framework. In particular, we extend BERT's text classification framework with label embeddings and bi-directional attention. Experimental results on five datasets show that our model yields notable improvements over both baselines and state-of-the-art models.
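To make the extension concrete, below is a minimal PyTorch sketch of how token-level BERT hidden states might be combined with learned label embeddings through bi-directional attention (label-to-token and token-to-label) before per-label sigmoid classification. The module name, the bilinear scoring, and the fusion by concatenation are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelEmbeddingBiAttention(nn.Module):
    """Hypothetical sketch: bi-directional attention between BERT token
    representations and learned label embeddings for multi-label text
    classification. Names and fusion choices are assumptions."""

    def __init__(self, hidden_dim: int, num_labels: int):
        super().__init__()
        # One learned embedding per label (assumed design, not from the paper).
        self.label_emb = nn.Parameter(torch.randn(num_labels, hidden_dim) * 0.02)
        self.proj = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.classifier = nn.Linear(2 * hidden_dim, 1)

    def forward(self, hidden_states, attention_mask):
        # hidden_states: (B, T, d) token-level BERT outputs
        # attention_mask: (B, T), 1 for real tokens, 0 for padding
        scores = self.proj(hidden_states) @ self.label_emb.t()          # (B, T, L)
        scores = scores.masked_fill(attention_mask.unsqueeze(-1) == 0, -1e9)

        # Label -> token direction: each label attends over the tokens.
        alpha = F.softmax(scores, dim=1)                                 # (B, T, L)
        label_ctx = torch.einsum('btl,btd->bld', alpha, hidden_states)   # (B, L, d)

        # Token -> label direction: each token attends over the labels,
        # then mean-pool the label-aware token features.
        beta = F.softmax(scores, dim=2)                                  # (B, T, L)
        token_ctx = torch.einsum('btl,ld->btd', beta, self.label_emb)    # (B, T, d)
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (token_ctx * mask).sum(1) / mask.sum(1).clamp(min=1)    # (B, d)

        # Fuse both directions and emit one logit per label.
        fused = torch.cat(
            [label_ctx, pooled.unsqueeze(1).expand_as(label_ctx)], dim=-1)
        return self.classifier(fused).squeeze(-1)                        # (B, L)
```

In training, the (B, L) logits would pair with torch.nn.BCEWithLogitsLoss against multi-hot label targets, the standard objective for multi-label classification.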
