Paper Information - Adjusting BERT's Pooling Layer for Large-Scale Multi-Label Text Classification

Adjusting BERT's Pooling Layer for Large-Scale Multi-Label Text Classification

In this paper, we present our experiments with BERT models on the task of Large-scale Multi-label Text Classification (LMTC). In the LMTC task, each text document can have multiple class labels, while the total number of classes is on the order of thousands. We propose a pooling layer architecture on top of BERT models that improves classification quality by combining information from the standard [CLS] token with the pooled sequence output. We demonstrate the improvements on Wikipedia datasets in three different languages using public pre-trained BERT models.
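The abstract does not spell out the pooling architecture, so the following is only a minimal, dependency-free sketch of the general idea of combining the [CLS] vector with a pooled sequence output. It assumes mean pooling over non-padding tokens, concatenation with the [CLS] vector, and one independent sigmoid per label; these choices, and the function names `masked_mean`, `combine_cls_and_pooled`, and `label_probabilities`, are illustrative assumptions and not the authors' method.

```python
import math


def masked_mean(token_vectors, mask):
    """Average the token vectors whose mask entry is 1 (non-padding)."""
    dim = len(token_vectors[0])
    kept = [vec for vec, m in zip(token_vectors, mask) if m]
    if not kept:
        return [0.0] * dim
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def combine_cls_and_pooled(token_vectors, mask):
    """Concatenate the [CLS] vector (position 0) with the mean of the other tokens.

    Assumption: mean pooling plus concatenation stands in for the paper's
    pooling layer, whose exact form is not given in the abstract.
    """
    cls_vector = token_vectors[0]
    pooled = masked_mean(token_vectors[1:], mask[1:])
    return cls_vector + pooled  # list concatenation, length 2 * hidden_size


def label_probabilities(features, weights, biases):
    """Multi-label head: one independent sigmoid per label (hypothetical weights)."""
    probs = []
    for w_row, b in zip(weights, biases):
        logit = sum(w * x for w, x in zip(w_row, features)) + b
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs


# Toy input: [CLS], two real tokens, one padding position; hidden size 2.
sequence_output = [[1.0, 0.0], [2.0, 4.0], [4.0, 0.0], [9.0, 9.0]]
attention_mask = [1, 1, 1, 0]

features = combine_cls_and_pooled(sequence_output, attention_mask)

# Two labels with made-up weights, purely for illustration.
weights = [[0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
biases = [0.0, 0.0]
probs = label_probabilities(features, weights, biases)
```

In a multi-label setting each label is scored independently, so a document may receive several labels whose probability exceeds a chosen threshold. With thousands of classes, `weights` would simply have thousands of rows.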

Jan Švec | Pavel Ircing | Luboš Šmídl | Jan Lehečka
