LETS: A Label-Efficient Training Scheme for Aspect-Based Sentiment Analysis by Using a Pre-Trained Language Model

Recently proposed pre-trained language models can be easily fine-tuned to a wide range of downstream tasks. However, a large-scale labelled task-specific dataset is required for fine-tuning creating a bottleneck in the development process of machine learning applications. To foster a fast development by reducing manual labelling efforts, we propose a Label-Efficient Training Scheme (LETS). The proposed LETS consists of three elements: (i) task-specific pre-training to exploit unlabelled task-specific corpus data, (ii) label augmentation to maximise the utility of labelled data, and (iii) active learning to label data strategically. In this paper, we apply LETS to a novel aspect-based sentiment analysis (ABSA) use-case for analysing the reviews of the health-related program supporting people to improve their sleep quality. We validate the proposed LETS on a custom health-related program-reviews dataset and another ABSA benchmark dataset. Experimental results show that the LETS can reduce manual labelling efforts 2-3 times compared to labelling with random sampling on both datasets. The LETS also outperforms other state-of-the-art active learning methods. Furthermore, the experimental results show that LETS can contribute to better generalisability with both datasets compared to other methods thanks to the task-specific pre-training and the proposed label augmentation. We expect this work could contribute to the natural language processing (NLP) domain by addressing the issue of the high cost of manually labelling data. Also, our work could contribute to the healthcare domain by introducing a new potential application of NLP techniques.

[1]  Xuanjing Huang,et al.  How to Fine-Tune BERT for Text Classification? , 2019, CCL.

[2]  K. Vijay-Shanker,et al.  A Method for Stopping Active Learning Based on Stabilizing Predictions and the Need for User-Adjustable Stopping , 2009, CoNLL.

[3]  Abeer Alsadoon,et al.  Deep Learning for Aspect-Based Sentiment Analysis: A Comparative Review , 2019, Expert Syst. Appl..

[4]  Luyao Huang,et al.  Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence , 2019, NAACL.

[5]  Eyal Shnarch,et al.  Active Learning for BERT: An Empirical Study , 2020, EMNLP.

[6]  Caroline Brun,et al.  Aspect Based Sentiment Analysis into the Wild , 2018, WASSA@EMNLP.

[7]  David D. Lewis,et al.  Heterogeneous Uncertainty Sampling for Supervised Learning , 1994, ICML.

[8]  Zoubin Ghahramani,et al.  Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning , 2015, ICML.

[9]  Alec Radford,et al.  Improving Language Understanding by Generative Pre-Training , 2018 .

[10]  Ruimao Zhang,et al.  Cost-Effective Active Learning for Deep Image Classification , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[11]  Anima Anandkumar,et al.  Deep Active Learning for Named Entity Recognition , 2017, Rep4NLP@ACL.

[12]  Tao Li,et al.  Aspect Based Sentiment Analysis with Gated Convolutional Networks , 2018, ACL.

[13]  Alexander Panchenko,et al.  Active Learning with Deep Pre-trained Models for Sequence Tagging of Clinical and Biomedical Texts , 2019, 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM).

[14]  Zoubin Ghahramani,et al.  Deep Bayesian Active Learning with Image Data , 2017, ICML.

[15]  Daniel Reker,et al.  Practical considerations for active machine learning in drug discovery. , 2019, Drug discovery today. Technologies.

[16]  Maria-Florina Balcan,et al.  Margin Based Active Learning , 2007, COLT.

[17]  Yiming Yang,et al.  XLNet: Generalized Autoregressive Pretraining for Language Understanding , 2019, NeurIPS.

[18]  Jingbo Zhu,et al.  Confidence-based stopping criteria for active learning for data annotation , 2010, TSLP.

[19]  John G. Breslin,et al.  A Hierarchical Model of Reviews for Aspect-based Sentiment Analysis , 2016, EMNLP.

[20]  Zoubin Ghahramani,et al.  Bayesian Active Learning for Classification and Preference Learning , 2011, ArXiv.

[21]  Burr Settles,et al.  Active Learning Literature Survey , 2009 .

[22]  Philip S. Yu,et al.  BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis , 2019, NAACL.

[23]  Doug Downey,et al.  Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks , 2020, ACL.

[24]  Sanjoy Dasgupta,et al.  Hierarchical sampling for active learning , 2008, ICML '08.

[25]  Li Zhao,et al.  Attention-based LSTM for Aspect-level Sentiment Classification , 2016, EMNLP.

[26]  Sang Joon Kim,et al.  A Mathematical Theory of Communication , 2006 .

[27]  Suresh Manandhar,et al.  SemEval-2014 Task 4: Aspect Based Sentiment Analysis , 2014, *SEMEVAL.

[28]  Wolfgang Lehner,et al.  WeakAL: Combining Active Learning and Weak Supervision , 2020, DS.

[29]  Zachary C. Lipton,et al.  Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study , 2018, EMNLP.

[30]  Jian Wu,et al.  Multi-Label Active Learning Algorithms for Image Classification , 2020, ACM Comput. Surv..

[31]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[32]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[33]  Hsuan-Tien Lin,et al.  Cold-start Active Learning through Self-Supervised Language Modeling , 2020, EMNLP.

[34]  Catherine Havasi,et al.  ConceptNet 5.5: An Open Multilingual Graph of General Knowledge , 2016, AAAI.

[35]  Kai Zou,et al.  EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks , 2019, EMNLP.

[36]  Andreas Nürnberger,et al.  The Power of Ensembles for Active Learning in Image Classification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[37]  William A. Gale,et al.  A sequential algorithm for training text classifiers , 1994, SIGIR '94.

[38]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[39]  C. Lee Giles,et al.  Learning on the border: active learning in imbalanced data classification , 2007, CIKM '07.

[40]  Charles Blundell,et al.  Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles , 2016, NIPS.

[41]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .