MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

This paper presents MixText, a semi-supervised learning method for text classification that uses our newly designed data augmentation method, TMix. TMix creates a large number of augmented training samples by interpolating text in hidden space. Moreover, we leverage recent advances in data augmentation to guess low-entropy labels for unlabeled data, hence making them as easy to use as labeled data. By mixing labeled, unlabeled, and augmented data, MixText significantly outperforms current pre-trained and fine-tuned models and other state-of-the-art semi-supervised learning methods on several text classification benchmarks. The improvement is especially prominent when supervision is extremely limited. We have publicly released our code at this https URL.
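To make the hidden-space interpolation concrete, below is a minimal PyTorch sketch of a TMix-style mixing step. This is not the released implementation: the toy Transformer encoder, the layer at which mixing happens, and the Beta(alpha, alpha) prior on the mixing ratio are illustrative assumptions.

```python
# Minimal sketch of TMix-style hidden-space interpolation (illustrative, not
# the authors' released code). Assumes both batches are padded to equal length.
import numpy as np
import torch
import torch.nn as nn


class TMixEncoder(nn.Module):
    """Toy Transformer encoder whose hidden states can be mixed mid-stack."""

    def __init__(self, vocab_size=30522, d_model=256, n_layers=6, n_classes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
             for _ in range(n_layers)]
        )
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, x, x2=None, lam=None, mix_layer=3):
        # x, x2: token-id tensors of shape (batch, seq_len)
        h = self.embed(x)
        h2 = self.embed(x2) if x2 is not None else None
        for i, layer in enumerate(self.layers):
            if h2 is not None and i == mix_layer:
                h = lam * h + (1.0 - lam) * h2   # interpolate in hidden space
                h2 = None                        # continue with the mixed sequence only
            h = layer(h)
            if h2 is not None:
                h2 = layer(h2)
        return self.classifier(h.mean(dim=1))    # mean-pool tokens, then classify


def tmix_loss(model, x1, y1, x2, y2, alpha=0.75, mix_layer=3):
    """Mix two batches in hidden space; targets are mixed with the same lambda."""
    lam = float(np.random.beta(alpha, alpha))
    lam = max(lam, 1.0 - lam)                    # keep the mix closer to batch 1
    logits = model(x1, x2=x2, lam=lam, mix_layer=mix_layer)
    log_probs = torch.log_softmax(logits, dim=-1)
    y_mix = lam * y1 + (1.0 - lam) * y2          # y1, y2: one-hot or soft labels
    return -(y_mix * log_probs).sum(dim=-1).mean()
```

In MixText, the same interpolation is applied across labeled, unlabeled, and augmented (e.g., back-translated) examples, with guessed low-entropy labels for the unlabeled data serving as the soft targets above.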
