SemEval-2023 Task 2: Fine-grained Multilingual Named Entity Recognition (MultiCoNER 2)

We present the findings of SemEval-2023 Task 2 on Fine-grained Multilingual Named Entity Recognition (MultiCoNER 2). Divided into 13 tracks, the task focused on methods to identify complex fine-grained named entities (like WRITTENWORK, VEHICLE, MUSICALGRP) across 12 languages, in both monolingual and multilingual scenarios, as well as noisy settings. The task used the MultiCoNER V2 dataset, composed of 2.2 million instances in Bangla, Chinese, English, Farsi, French, German, Hindi, Italian, Portuguese, Spanish, Swedish, and Ukrainian. MultiCoNER 2 was one of the most popular tasks of SemEval-2023. It attracted 842 submissions from 47 teams, and 34 teams submitted system papers. Results showed that complex entity types such as media titles and product names were the most challenging. Methods fusing external knowledge into transformer models achieved the best performance, and the largest gains were on the Creative Work and Group classes, which are still challenging even with external knowledge. Some fine-grained classes proved to be more challenging than others, such as SCIENTIST, ARTWORK, and PRIVATECORP. We also observed that noisy data has a significant impact on model performance, with an average drop of 10% on the noisy subset. The task highlights the need for future research on improving NER robustness on noisy data containing complex entities.
