Hierarchical Multi-label Text Classification with Horizontal and Vertical Category Correlations

Hierarchical multi-label text classification (HMTC) deals with the challenging task where an instance can be assigned to multiple hierarchically structured categories at the same time. The majority of prior studies either focus on reducing the HMTC task into a flat multi-label problem ignoring the vertical category correlations or exploiting the dependencies across different hierarchical levels without considering the horizontal correlations among categories at the same level, which inevitably leads to fundamental information loss. In this paper, we propose a novel HMTC framework that considers both vertical and horizontal category correlations. Specifically, we first design a loosely coupled graph convolutional neural network as the representation extractor to obtain representations for words, documents, and, more importantly, level-wise representations for categories, which are not considered in previous works. Then, the learned category representations are adopted to capture the vertical dependencies among levels of category hierarchy and model the horizontal correlations. Finally, based on the document embeddings and category embeddings, we design a hybrid algorithm to predict the categories of the entire hierarchical structure. Extensive experiments conducted on real-world HMTC datasets validate the effectiveness of the proposed framework with significant improvements over the baselines.

[1]  Yuan Luo,et al.  Graph Convolutional Networks for Text Classification , 2018, AAAI.

[2]  Jianxin Li,et al.  Large-Scale Hierarchical Text Classification with Recursively Regularized Deep Graph-CNN , 2018, WWW.

[3]  A. Törcsvári,et al.  Automated categorization in the international patent classification , 2003, SIGF.

[4]  Enhong Chen,et al.  Hierarchical Multi-label Text Classification: An Attention-based Recurrent Network Approach , 2019, CIKM.

[5]  Donald E. Brown,et al.  HDLTex: Hierarchical Deep Learning for Text Classification , 2017, 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA).

[6]  Ah Chung Tsoi,et al.  The Graph Neural Network Model , 2009, IEEE Transactions on Neural Networks.

[7]  Bernard Ghanem,et al.  DeepGCNs: Making GCNs Go as Deep as CNNs , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Alex Alves Freitas,et al.  Comparing Several Approaches for Hierarchical Classification of Proteins with Decision Trees , 2007, BSB.

[9]  Thomas Hofmann,et al.  Hierarchical document categorization with support vector machines , 2004, CIKM '04.

[10]  Noam Slonim,et al.  Learning Thematic Similarity Metric from Article Sections Using Triplet Networks , 2018, ACL.

[11]  Houfeng Wang,et al.  Text Level Graph Neural Network for Text Classification , 2019, EMNLP.

[12]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[13]  Wei Wu,et al.  SGM: Sequence Generation Model for Multi-label Classification , 2018, COLING.

[14]  Linli Xu,et al.  Label Incorporated Graph Neural Networks for Text Classification , 2021, 2020 25th International Conference on Pattern Recognition (ICPR).

[15]  André Carlos Ponce de Leon Ferreira de Carvalho,et al.  Reduction strategies for hierarchical multi-label classification in protein function prediction , 2016, BMC Bioinformatics.

[16]  Mathias Niepert,et al.  Learning Convolutional Neural Networks for Graphs , 2016, ICML.

[17]  Rodrigo C. Barros,et al.  Hierarchical Multi-Label Classification Networks , 2018, ICML.

[18]  Saso Dzeroski,et al.  Decision trees for hierarchical multi-label classification , 2008, Machine Learning.

[19]  Wenwu Zhu,et al.  Deep Learning on Graphs: A Survey , 2018, IEEE Transactions on Knowledge and Data Engineering.

[20]  Ning Ding,et al.  Hierarchy-Aware Global Model for Hierarchical Text Classification , 2020, ACL.

[21]  Yu Meng,et al.  TaxoClass: Hierarchical Multi-Label Text Classification Using Only Class Names , 2021, NAACL.

[22]  Kostas Tsioutsiouliklis,et al.  Hierarchical Transfer Learning for Multi-label Text Classification , 2019, ACL.