Using the Past Knowledge to Improve Sentiment Classification

This paper studies sentiment classification in the lifelong learning setting that incrementally learns a sequence of sentiment classification tasks. It proposes a new lifelong learning model (called L2PG) that can retain and selectively transfer the knowledge learned in the past to help learn the new task. A key innovation of this proposed model is a novel parameter-gate (p-gate) mechanism that regulates the flow or transfer of the previously learned knowledge to the new task. Specifically, it can selectively use the network parameters (which represent the retained knowledge gained from the previous tasks) to assist the learning of the new task t. Knowledge distillation is also employed in the process to preserve the past knowledge by approximating the network output at the state when task t-1 was learned. Experimental results show that L2PG outperforms strong baselines, including even multiple task learning.

[1]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[2]  Bing Liu,et al.  Overcoming Catastrophic Forgetting for Continual Learning via Model Adaptation , 2018, ICLR.

[3]  Yee Whye Teh,et al.  Progress & Compress: A scalable framework for continual learning , 2018, ICML.

[4]  Rui Xia,et al.  Distantly Supervised Lifelong Learning for Large-Scale Social Media Sentiment Analysis , 2017, IEEE Transactions on Affective Computing.

[5]  Yandong Guo,et al.  Incremental Classifier Learning with Generative Adversarial Networks , 2018, ArXiv.

[6]  Razvan Pascanu,et al.  Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.

[7]  Shuai Wang,et al.  Lifelong Learning Memory Networks for Aspect Sentiment Classification , 2018, 2018 IEEE International Conference on Big Data (Big Data).

[8]  Michael McCloskey,et al.  Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem , 1989 .

[9]  Rama Chellappa,et al.  Learning Without Memorizing , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Xuanjing Huang,et al.  Adversarial Multi-task Learning for Text Classification , 2017, ACL.

[11]  Alexandros Karatzoglou,et al.  Overcoming Catastrophic Forgetting with Hard Attention to the Task , 2018 .

[12]  Derek Hoiem,et al.  Learning without Forgetting , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Bing Liu,et al.  Lifelong machine learning: a paradigm for continuous learning , 2017, Frontiers of Computer Science.

[14]  Yoshua Bengio,et al.  An Empirical Investigation of Catastrophic Forgeting in Gradient-Based Neural Networks , 2013, ICLR.

[15]  Eric Eaton,et al.  ELLA: An Efficient Lifelong Learning Algorithm , 2013, ICML.

[16]  Stefan Wermter,et al.  Continual Lifelong Learning with Neural Networks: A Review , 2019, Neural Networks.

[17]  Bing Liu,et al.  Feature Projection for Improved Text Classification , 2020, ACL.

[18]  Taesup Moon,et al.  Uncertainty-based Continual Learning with Adaptive Regularization , 2019, NeurIPS.

[19]  Enhong Chen,et al.  Interactive Attention Transfer Network for Cross-Domain Sentiment Classification , 2018 .

[20]  Bing Liu,et al.  Topic Modeling using Topics from Many Domains, Lifelong Learning and Big Data , 2014, ICML.

[21]  Hao Wang,et al.  Forward and Backward Knowledge Transfer for Sentiment Classification , 2019, ACML.

[22]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[23]  Guoyin Wang,et al.  Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms , 2018, ACL.

[24]  Yu Zhang,et al.  Hierarchical Attention Transfer Network for Cross-Domain Sentiment Classification , 2018, AAAI.

[25]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[26]  Lei Zhang,et al.  Sentiment Analysis and Opinion Mining , 2017, Encyclopedia of Machine Learning and Data Mining.

[27]  Sebastian Thrun,et al.  Lifelong Learning Algorithms , 1998, Learning to Learn.

[28]  Enhong Chen,et al.  Sentiment Classification by Leveraging the Shared Knowledge from a Sequence of Domains , 2019, DASFAA.

[29]  Bing Liu,et al.  Lifelong Learning for Sentiment Classification , 2015, ACL.

[30]  Stefan Wermter,et al.  Continual Lifelong Learning with Neural Networks: A Review , 2018, Neural Networks.

[31]  Qiang Yang,et al.  Lifelong Machine Learning Systems: Beyond Learning Algorithms , 2013, AAAI Spring Symposium: Lifelong Machine Learning.

[32]  Razvan Pascanu,et al.  Progressive Neural Networks , 2016, ArXiv.

[33]  Philipp Koehn,et al.  Synthesis Lectures on Human Language Technologies , 2016 .