Multi-label text classification is a critical task in natural language processing. As a recent language representation model, BERT achieves new state-of-the-art results on classification tasks. Nevertheless, BERT's text classification framework does not make full use of token-level text representations and label embeddings, since it uses only the final hidden state corresponding to the [CLS] token as the sequence-level text representation for classification. We hypothesize that finer-grained token-level text representations and label embeddings contribute to classification. Consequently, in this paper we propose a Label-Embedding Bi-directional Attentive model to improve the performance of BERT's text classification framework. In particular, we extend BERT's text classification framework with label embeddings and bi-directional attention. Experimental results on five datasets show that our model achieves notable improvements over both baselines and state-of-the-art models.
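To make the described idea concrete, the sketch below shows one plausible way to wire bi-directional attention between BERT's token-level representations and a learned label embedding matrix, in PyTorch with the HuggingFace transformers library. The class name, the masked mean-pooling step, and the per-label scoring head are our own illustrative assumptions; the abstract does not specify these details, so this is a sketch of the general technique rather than the authors' implementation.

```python
# Hypothetical sketch of label embedding + bi-directional attention on top of
# BERT's token-level hidden states. Names and pooling choices are assumptions.
import torch
import torch.nn as nn
from transformers import BertModel

class LabelEmbedBiAttnClassifier(nn.Module):
    def __init__(self, num_labels: int, model_name: str = "bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size
        # One learnable embedding per label, same width as BERT's hidden states.
        self.label_embed = nn.Parameter(torch.randn(num_labels, hidden) * 0.02)
        # Scores each label from its attended context plus a shared text summary.
        self.classifier = nn.Linear(2 * hidden, 1)

    def forward(self, input_ids, attention_mask):
        # Token-level representations: (batch, seq_len, hidden)
        tokens = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        am = attention_mask.float()
        # Token-label similarity: (batch, seq_len, num_labels)
        scores = torch.einsum("bsh,lh->bsl", tokens, self.label_embed)
        # Mask padding positions before the softmax over the token axis.
        pad = (1.0 - am).unsqueeze(-1) * -1e9
        # Label-to-token attention: each label attends over the tokens.
        alpha = torch.softmax(scores + pad, dim=1)
        label_ctx = torch.einsum("bsl,bsh->blh", alpha, tokens)
        # Token-to-label attention: each token attends over the labels,
        # then masked mean-pooling yields a label-aware text summary.
        beta = torch.softmax(scores, dim=2)
        token_ctx = torch.einsum("bsl,lh->bsh", beta, self.label_embed)
        text_ctx = (token_ctx * am.unsqueeze(-1)).sum(1) / am.sum(1, keepdim=True)
        # Fuse both attention directions and score each label: (batch, num_labels)
        fused = torch.cat([label_ctx,
                           text_ctx.unsqueeze(1).expand_as(label_ctx)], dim=-1)
        return self.classifier(fused).squeeze(-1)
```

The model emits one logit per label, so it would typically be trained with a multi-label objective such as nn.BCEWithLogitsLoss, with a sigmoid threshold applied at inference time.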