TaxoClass: Hierarchical Multi-Label Text Classification Using Only Class Names

Hierarchical multi-label text classification (HMTC) aims to tag each document with a set of classes from a taxonomic class hierarchy. Most existing HMTC methods train classifiers on massive numbers of human-labeled documents, which are often too costly to obtain in real-world applications. In this paper, we explore conducting HMTC using only class surface names as supervision signals. We observe that, to perform HMTC, human experts typically first pinpoint a few of the most essential classes for a document as its “core classes”, and then check those core classes’ ancestors to ensure coverage. To mimic human experts, we propose a novel HMTC framework named TaxoClass. Specifically, TaxoClass (1) calculates document-class similarities using a textual entailment model, (2) identifies a document’s core classes and uses the confident core classes to train a taxonomy-enhanced classifier, and (3) generalizes the classifier via multi-label self-training. Our experiments on two challenging datasets show that TaxoClass achieves around 0.71 Example-F1 using only class names, outperforming the best previous method by 25%.
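As a rough illustration of step (1), the sketch below scores document-class similarity with an off-the-shelf textual entailment (NLI) model: the document is the premise, and a class name is turned into a hypothesis whose entailment probability serves as the similarity. The model name, hypothesis template, and Hugging Face API usage are illustrative assumptions, not the paper's exact setup.

    # Minimal sketch of step (1): document-class similarity via textual entailment.
    # Assumes the `transformers` and `torch` packages; "roberta-large-mnli" is an
    # assumed off-the-shelf NLI model, not necessarily the one used in the paper.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    MODEL_NAME = "roberta-large-mnli"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
    model.eval()

    def doc_class_similarity(document: str, class_name: str) -> float:
        """Premise = document, hypothesis = templated class name;
        return the entailment probability as the similarity score."""
        hypothesis = f"This document is about {class_name}."  # assumed template
        inputs = tokenizer(document, hypothesis, truncation=True,
                           return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        # roberta-large-mnli label order: 0=contradiction, 1=neutral, 2=entailment
        probs = torch.softmax(logits, dim=-1)
        return probs[0, 2].item()

    doc = "We study self-attention mechanisms in transformer language models."
    for cls in ["machine learning", "cooking", "natural language processing"]:
        print(cls, round(doc_class_similarity(doc, cls), 3))

In the full framework, such similarity scores would then feed the core-class identification over the class hierarchy (step 2) before self-training (step 3); those steps are not shown here.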
