Meta-LMTC: Meta-Learning for Large-Scale Multi-Label Text Classification

Large-scale multi-label text classification (LMTC) tasks often face long-tailed label distributions, where many labels have few or even no training instances. Although current methods can exploit prior knowledge to handle these few/zero-shot labels, they neglect the meta-knowledge contained in the dataset that can guide models to learn with few samples. In this paper, for the first time, this problem is addressed from a meta-learning perspective. However, the simple extension of meta-learning approaches to multi-label classification is sub-optimal for LMTC tasks due to the long-tailed label distribution and the coexistence of few- and zero-shot scenarios. We propose a meta-learning approach named META-LMTC. Specifically, it constructs more faithful and more diverse tasks according to well-designed sampling strategies and directly incorporates the objective of adapting to new low-resource tasks into the meta-learning phase. Extensive experiments show that META-LMTC achieves state-of-the-art performance against strong baselines and can still enhance powerful BERT-like models.
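
To make the long-tailed setting concrete, the sketch below groups labels by how often they appear in the training set, separating frequent, few-shot, and zero-shot labels. It is only an illustration of the problem setup, not the sampling strategy of META-LMTC; the function name `bucket_labels`, the threshold `few_shot_max`, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def bucket_labels(label_space, train_label_sets, few_shot_max=5):
    """Group labels by training frequency (hypothetical helper).

    few_shot_max is an assumed cut-off: labels seen at most this many
    times (but at least once) are treated as few-shot.
    """
    # Count each label once per training document.
    counts = Counter(
        label for labels in train_label_sets for label in set(labels)
    )
    buckets = {"frequent": [], "few": [], "zero": []}
    for label in label_space:
        n = counts[label]
        if n == 0:
            buckets["zero"].append(label)      # never seen in training
        elif n <= few_shot_max:
            buckets["few"].append(label)       # seen only a handful of times
        else:
            buckets["frequent"].append(label)  # well-represented label
    return buckets


# Toy data: four labels, four training documents (made up for illustration).
label_space = ["A", "B", "C", "D"]
train_label_sets = [["A", "B"], ["A"], ["A"], ["A", "C"]]
buckets = bucket_labels(label_space, train_label_sets, few_shot_max=1)
print(buckets)
```

In this toy example, label `A` lands in the frequent group, `B` and `C` in the few-shot group, and `D`, which never occurs in training, in the zero-shot group.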

Jiajun Chen | Shujian Huang | Xinyu Dai | Ran Wang | Xi'ao Su | Siyu Long
