Are We Really Making Much Progress in Text Classification? A Comparative Review

This study reviews and compares methods for single-label and multi-label text classification, categorized into bag-of-words, sequence-based, graph-based, and hierarchical methods. The comparison aggregates results from the literature over five single-label and seven multi-label datasets and complements them with new experiments. The findings reveal that all recently proposed graph-based and hierarchy-based methods fail to outperform pre-trained language models and sometimes perform worse than standard machine learning methods like a multilayer perceptron on a bag-of-words. To assess the true scientific progress in text classification, future work should thoroughly test against strong bag-of-words baselines and state-of-the-art pre-trained language models.
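To make concrete the kind of baseline the abstract refers to, the following is a minimal sketch of a multilayer perceptron on a bag-of-words. It is not the authors' implementation. The toy documents, the hidden size of 8, the learning rate, and the names `build_vocab`, `bag_of_words`, and `TinyMLP` are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random


def build_vocab(docs):
    """Assign each distinct lowercase token a column index."""
    vocab = {}
    for doc in docs:
        for tok in doc.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def bag_of_words(doc, vocab):
    """Term-count vector: word order is discarded, unknown tokens are skipped."""
    vec = [0.0] * len(vocab)
    for tok in doc.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


class TinyMLP:
    """One hidden ReLU layer followed by a softmax output (hypothetical sizes)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.W1, self.b1)]
        logits = [sum(w * hj for w, hj in zip(row, h)) + b
                  for row, b in zip(self.W2, self.b2)]
        return h, softmax(logits)

    def train_step(self, x, y, lr=0.1):
        """One SGD step on the cross-entropy loss; returns the loss before the update."""
        h, p = self.forward(x)
        d_logits = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
        # Backpropagate to the hidden layer before W2 is modified.
        d_h = [sum(d_logits[k] * self.W2[k][j] for k in range(len(d_logits))) if h[j] > 0 else 0.0
               for j in range(len(h))]
        for k, dk in enumerate(d_logits):
            for j, hj in enumerate(h):
                self.W2[k][j] -= lr * dk * hj
            self.b2[k] -= lr * dk
        for j, dj in enumerate(d_h):
            for i, xi in enumerate(x):
                self.W1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj
        return -math.log(p[y])

    def predict(self, x):
        _, p = self.forward(x)
        return max(range(len(p)), key=p.__getitem__)


# Toy single-label data (assumed for illustration): 1 = spam, 0 = work mail.
docs = ["cheap pills buy now", "meeting agenda attached",
        "buy cheap watches now", "project meeting tomorrow"]
labels = [1, 0, 1, 0]

vocab = build_vocab(docs)
X = [bag_of_words(d, vocab) for d in docs]
model = TinyMLP(n_in=len(vocab), n_hidden=8, n_out=2)

for epoch in range(200):
    for x, y in zip(X, labels):
        model.train_step(x, y)

predictions = [model.predict(x) for x in X]
```

Because the input is a vector of term counts, the model sees no word order at all, which is what distinguishes this family from the sequence-based and graph-based methods in the comparison. For multi-label classification, the softmax output would typically be replaced by one sigmoid per label.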
