SemEval-2020 Task 10: Emphasis Selection for Written Text in Visual Media

In this paper, we present the main findings and compare the results of SemEval-2020 Task 10, Emphasis Selection for Written Text in Visual Media. The goal of this shared task is to design automatic methods for emphasis selection, i.e., choosing candidates for emphasis in textual content to enable automated design assistance in authoring. The main focus is on short text instances used in visual media, with examples ranging from social media posts to inspirational quotes. Participants were asked to model emphasis using plain text only, with no additional context from the user or other design considerations. The SemEval-2020 Emphasis Selection shared task attracted 197 participants in the early phase, and a total of 31 teams made submissions. The highest-ranked submission achieved a Match_m score of 0.823. Analysis of the submitted systems indicates that BERT and RoBERTa were the most common choices of pre-trained model, and part-of-speech (POS) tags were the most useful feature. Full results can be found on the task's website.
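As a rough illustration of the Match_m score reported above, the sketch below assumes the commonly described formulation: for each instance, the top-m tokens ranked by predicted emphasis score are compared against the top-m tokens ranked by the ground-truth (crowd-sourced) emphasis probabilities, the overlap is normalized by min(m, instance length), and scores are averaged over instances and over m = 1..4. Function names and the exact tie-breaking are illustrative assumptions, not the task's official evaluation code.

```python
def match_m(pred_scores, gold_scores, m):
    """Match_m for a single instance (assumed formulation):
    overlap between the top-m token indices ranked by predicted
    and by gold emphasis scores, normalized by min(m, length)."""
    top = lambda scores: set(sorted(range(len(scores)),
                                    key=lambda i: -scores[i])[:m])
    return len(top(pred_scores) & top(gold_scores)) / min(m, len(gold_scores))


def corpus_match(preds, golds, ms=(1, 2, 3, 4)):
    """Average Match_m over all instances, then over m = 1..4."""
    per_m = [sum(match_m(p, g, m) for p, g in zip(preds, golds)) / len(golds)
             for m in ms]
    return sum(per_m) / len(per_m)
```

For example, if a model's per-token scores rank exactly the same tokens first as the annotators did, every Match_m is 1.0; a model that picks an entirely disjoint top-1 token scores 0.0 for m = 1.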
