OCHADAI-KYOTO at SemEval-2021 Task 1: Enhancing Model Generalization and Robustness for Lexical Complexity Prediction

We propose an ensemble model for predicting the lexical complexity of words and multiword expressions (MWEs). The model receives as input a sentence with a target word or MWE and outputs its complexity score. Since a key challenge of this task is the limited size of the annotated data, our model relies on pretrained contextual representations from two state-of-the-art transformer-based language models, BERT and RoBERTa, and on a variety of training methods that further enhance model generalization and robustness: multi-step fine-tuning, multi-task learning, and adversarial training. Additionally, we enrich the contextual representations with hand-crafted features during training. Our model achieved competitive results, ranking among the top ten systems in both sub-tasks.
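As a rough illustration of the architecture the abstract describes, the sketch below shows a pretrained transformer whose sentence-level representation is concatenated with hand-crafted features before a regression head, followed by one adversarial perturbation step. This is a minimal sketch assuming the Hugging Face transformers API; the encoder name (roberta-base), the two toy features, the epsilon value, and the FGSM-style perturbation are illustrative assumptions, not details reported in the paper.

```python
# A minimal sketch of the approach the abstract describes, not the authors'
# released code: a pretrained transformer whose first-token representation is
# concatenated with hand-crafted features before a regression head, plus one
# FGSM-style adversarial step on the input embeddings. The encoder name,
# feature choices, epsilon, and gold label below are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class ComplexityRegressor(nn.Module):
    def __init__(self, encoder_name="roberta-base", num_handcrafted=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Sigmoid keeps the predicted score in [0, 1], matching CompLex labels.
        self.head = nn.Sequential(
            nn.Linear(hidden + num_handcrafted, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
            nn.Sigmoid(),
        )

    def forward(self, attention_mask, handcrafted, input_ids=None, inputs_embeds=None):
        out = self.encoder(input_ids=input_ids, inputs_embeds=inputs_embeds,
                           attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] / <s> token representation
        return self.head(torch.cat([cls, handcrafted], dim=-1)).squeeze(-1)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = ComplexityRegressor()

# Encode the sentence and the target jointly as a sentence pair, so the
# encoder sees the target word in context.
sentence = "The patriarch brought the basket of first fruits."
target = "patriarch"
batch = tokenizer(sentence, target, return_tensors="pt")

# Two illustrative hand-crafted features (target length, sentence length);
# the paper's actual feature set may differ.
feats = torch.tensor([[float(len(target)), float(len(sentence.split()))]])

score = model(batch["attention_mask"], feats, input_ids=batch["input_ids"])
print(score.item())  # predicted complexity in [0, 1]

# One FGSM-style adversarial step (Goodfellow et al.), standing in for the
# adversarial training the abstract mentions; 0.35 is a hypothetical label.
label = torch.tensor([0.35])
loss_fn = nn.MSELoss()

embeds = model.encoder.get_input_embeddings()(batch["input_ids"]).detach()
embeds.requires_grad_(True)
loss = loss_fn(model(batch["attention_mask"], feats, inputs_embeds=embeds), label)
loss.backward()

epsilon = 1e-2
perturbed = embeds + epsilon * embeds.grad.sign()  # step along the gradient sign
adv_loss = loss_fn(model(batch["attention_mask"], feats,
                         inputs_embeds=perturbed.detach()), label)
# During training, the clean and adversarial losses would be combined and
# minimized together to improve robustness.
```

Encoding the sentence and the target as a pair is one common way to expose the target word in context; the ensemble the abstract describes would presumably average such scores across BERT- and RoBERTa-based variants.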
