MaChAmp at SemEval-2023 tasks 2, 3, 4, 5, 7, 8, 9, 10, 11, and 12: On the Effectiveness of Intermediate Training on an Uncurated Collection of Datasets.

To improve the ability of language models to handle Natural Language Processing (NLP) tasks, an intermediate step of training has recently been introduced. In this setup, one takes a pre-trained language model, trains it on a (set of) NLP dataset(s), and then fine-tunes it for a target task. It is known that the selection of relevant transfer tasks is important, but recently some work has shown substantial performance gains from intermediate training on a very large set of datasets. Most previous work uses generative language models, or focuses on only one or a few tasks with a carefully curated setup. We compare intermediate training with one or many tasks in a setup where the choice of datasets is more arbitrary: we use all SemEval 2023 text-based tasks. We obtain performance improvements for most tasks when using intermediate training. Gains are higher when doing intermediate training on a single task than on all tasks, provided the right transfer task is identified. Dataset smoothing and heterogeneous batching did not lead to robust gains in our setup.
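The two-stage recipe described in the abstract (pre-trained encoder → intermediate training → target-task fine-tuning) can be sketched with a generic Hugging Face Transformers pipeline. This is only an illustration of the general setup, not the MaChAmp configuration used in the paper; the dataset variables, label counts, checkpoint paths, and model name are hypothetical placeholders.

```python
# Minimal sketch of intermediate training, assuming Hugging Face Transformers
# and Datasets. NOT the paper's MaChAmp setup; "intermediate_ds" and
# "target_ds" are hypothetical DatasetDicts with "text" and "label" columns.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

def finetune(model_name_or_path, dataset, output_dir, num_labels):
    """Fine-tune a (possibly already fine-tuned) encoder on one dataset."""
    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name_or_path,
        num_labels=num_labels,
        ignore_mismatched_sizes=True,  # replace the head when label sets differ
    )
    dataset = dataset.map(
        lambda ex: tokenizer(ex["text"], truncation=True), batched=True)
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, num_train_epochs=3),
        train_dataset=dataset["train"],
        tokenizer=tokenizer,  # enables default padding collator
    )
    trainer.train()
    trainer.save_model(output_dir)  # checkpoint reused in the next stage
    return output_dir

# Stage 1: intermediate training on an auxiliary task (hypothetical dataset).
intermediate_ckpt = finetune("bert-base-multilingual-cased", intermediate_ds,
                             "ckpt/intermediate", num_labels=3)
# Stage 2: fine-tune on the actual target task, starting from the stage-1 model.
finetune(intermediate_ckpt, target_ds, "ckpt/target", num_labels=2)
```

In the multi-task variant compared in the paper, the intermediate stage would instead train on many datasets jointly before the target-task fine-tuning step.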
