Hierarchy-aware Label Semantics Matching Network for Hierarchical Text Classification

Hierarchical text classification is an important yet challenging task due to the complex structure of the label hierarchy. Existing methods ignore the semantic relationship between text and labels, and therefore cannot make full use of the hierarchical information. To this end, we formulate the text-label relationship as a semantic matching problem and propose a hierarchy-aware label semantics matching network (HiMatch). First, we project text semantics and label semantics into a joint embedding space. We then introduce a joint embedding loss and a matching learning loss to model the matching relationship between text semantics and label semantics. In this way, our model captures the text-label semantics matching relationship among both coarse-grained and fine-grained labels in a hierarchy-aware manner. Experimental results on several benchmark datasets verify that our model achieves state-of-the-art performance.
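The joint-embedding and matching idea described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the linear projections, Euclidean distance, margin value, and triplet-style hinge form of the matching loss are all assumptions made for the example.

```python
import math

def project(x, W):
    """Linearly project a feature vector x into the joint embedding
    space using matrix W (given as a list of rows)."""
    return [sum(xi * wij for xi, wij in zip(x, col)) for col in zip(*W)]

def dist(a, b):
    """Euclidean distance in the joint embedding space."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def matching_loss(text_vec, gold_label_vec, wrong_label_vec, margin=0.2):
    """Triplet-style hinge loss: the text representation should lie
    closer to its gold label than to an incorrect label, by at least
    the given margin."""
    return max(0.0, margin
               + dist(text_vec, gold_label_vec)
               - dist(text_vec, wrong_label_vec))

# Toy example: 3-d text/label features projected into a 2-d joint space.
W_text = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
W_label = [[0.9, 0.1], [0.1, 0.9], [0.4, 0.6]]

text = project([1.0, 0.0, 0.0], W_text)    # -> [1.0, 0.0]
gold = project([1.0, 0.0, 0.0], W_label)   # -> [0.9, 0.1]
wrong = project([0.0, 1.0, 0.0], W_label)  # -> [0.1, 0.9]

print(matching_loss(text, gold, wrong))
```

In a trained model the projections would be learned jointly with the text encoder, and the hierarchy would typically impose smaller margins between a text and its coarse-grained ancestor labels than between the text and unrelated fine-grained labels.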
