A Hybrid BERT Model That Incorporates Label Semantics via Adjustive Attention for Multi-Label Text Classification

Multi-label text classification aims to assign a set of labels to a document. Previous studies usually treated labels as semantics-free symbols and ignored the relations among labels, which causes information loss. In this paper, we show that explicitly modeling label semantics can improve multi-label text classification. We propose a hybrid neural network model that simultaneously exploits label semantics and fine-grained text information. Specifically, we use the pre-trained BERT model to compute context-aware representations of documents. We then incorporate label semantics in two stages. First, a novel label graph construction approach captures label structures and correlations. Second, we propose a new attention mechanism, adjustive attention, which establishes semantic connections between labels and words and yields label-specific word representations. The hybrid representation, which combines the context-aware features with the label-specific word features, is fed into a document encoder for classification. Experimental results on two publicly available datasets show that our model outperforms other state-of-the-art classification methods.
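The abstract does not spell out the adjustive attention mechanism, so the following is only a minimal NumPy sketch of the general idea it describes: each label embedding (e.g. produced from a label graph) attends over a document's token representations (e.g. BERT outputs) to form a label-specific word representation. The dot-product scoring and the array shapes here are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def label_word_attention(word_repr, label_repr):
    """Generic label-word attention (illustrative, not the paper's exact method).

    word_repr:  (seq_len, dim)    e.g. per-token BERT outputs
    label_repr: (num_labels, dim) e.g. embeddings from a label graph
    returns:    (num_labels, dim) label-specific word representations
    """
    scores = label_repr @ word_repr.T       # (num_labels, seq_len) similarity
    weights = softmax(scores, axis=-1)      # each label's distribution over words
    return weights @ word_repr              # attention-weighted word features

rng = np.random.default_rng(0)
words = rng.normal(size=(12, 8))    # 12 tokens, hidden size 8 (toy values)
labels = rng.normal(size=(5, 8))    # 5 labels, same hidden size
feats = label_word_attention(words, labels)
print(feats.shape)                  # one feature vector per label
```

In the full model, these label-specific features would be concatenated with the context-aware document features before the final classifier; that combination step is omitted here.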
