Tackling Code-Switched NER: Participation of CMU

Named Entity Recognition (NER) plays a major role in several downstream NLP applications. Although the task has been studied extensively on formal monolingual text and on noisy text such as Twitter data, it remains an emerging problem for code-switched (CS) content on social media. This paper describes our participation in the shared task on Named Entity Recognition on Code-switched Data, covering Spanglish (Spanish + English) and Arabish (Arabic + English), and presents the models we developed from the task data. Because relationships between words in Twitter data are sparse and non-linear, we explored neural architectures that model such non-linearities well. Specifically, we trained character-level and word-level models based on Bidirectional LSTMs (Bi-LSTMs) to perform sequence tagging. We trained multiple models to identify nominal mentions and then used this information to predict named entity labels for a sequence. Our best systems are a character-level model combined with pre-trained multilingual word embeddings, which achieved an F-score of 56.72 on Spanglish, and a word-level model, which achieved an F-score of 65.02 on Arabish, on the test data.
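
The paper itself does not include code. As a rough, hypothetical illustration of the kind of architecture the abstract describes (a word-level Bi-LSTM sequence tagger augmented with character-level Bi-LSTM features), the following PyTorch sketch can be useful; all layer sizes, vocabulary sizes, class names, and the random toy inputs are assumptions for illustration, not the authors' implementation.

# Minimal sketch of a character + word level Bi-LSTM tagger (illustrative only).
# Dimensions, vocabularies, and the tag count are placeholder assumptions.
import torch
import torch.nn as nn

class CharWordBiLSTMTagger(nn.Module):
    def __init__(self, word_vocab, char_vocab, n_tags,
                 word_dim=300, char_dim=30, char_hidden=25, word_hidden=100):
        super().__init__()
        # Word embeddings; in practice these could be initialized from
        # pre-trained multilingual vectors instead of random values.
        self.word_emb = nn.Embedding(word_vocab, word_dim)
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        self.char_lstm = nn.LSTM(char_dim, char_hidden,
                                 batch_first=True, bidirectional=True)
        self.word_lstm = nn.LSTM(word_dim + 2 * char_hidden, word_hidden,
                                 batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * word_hidden, n_tags)  # per-token tag scores

    def forward(self, word_ids, char_ids):
        # word_ids: (batch, seq_len); char_ids: (batch, seq_len, max_word_len)
        b, s, c = char_ids.shape
        char_out, _ = self.char_lstm(self.char_emb(char_ids.view(b * s, c)))
        char_feats = char_out[:, -1, :].view(b, s, -1)  # last-step char representation per word
        word_feats = torch.cat([self.word_emb(word_ids), char_feats], dim=-1)
        seq_out, _ = self.word_lstm(word_feats)
        return self.out(seq_out)  # (batch, seq_len, n_tags)

# Toy usage with random ids (shapes only; real inputs would come from the shared-task corpora).
model = CharWordBiLSTMTagger(word_vocab=5000, char_vocab=100, n_tags=19)
scores = model(torch.randint(0, 5000, (2, 10)), torch.randint(0, 100, (2, 10, 12)))
print(scores.shape)  # torch.Size([2, 10, 19])

In a setup like the one described, the character-level features would typically be concatenated with the pre-trained multilingual word vectors, letting the model handle misspellings and code-switched word forms that fall outside the word vocabulary.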
