AMU-EURANOVA at CASE 2021 Task 1: Assessing the stability of multilingual BERT

This paper describes our participation in Task 1 of the CASE 2021 shared task, which addresses multilingual event extraction from news. We focused on sub-task 4, event information extraction. This sub-task has a small training dataset, and we fine-tuned a multilingual BERT model to solve it. We studied the instability problem that arises on this dataset and tried to mitigate it.
