Combining rule-based and statistical mechanisms for low-resource named entity recognition

We describe a multifaceted approach to named entity recognition that can be deployed with minimal data resources and a handful of hours of non-expert annotation. We show how this approach was applied in the 2016 LoReHLT evaluation and demonstrate that both its statistical and its rule-based components contribute to performance. We also demonstrate, across many languages, the value of selecting which sentences to annotate when training on small amounts of data.
