Named Entity Recognition with Bilingual Constraints

Different languages contain complementary cues about entities, which can be used to improve Named Entity Recognition (NER) systems. We propose a method that formulates the problem of exploiting such signals on unannotated bilingual text as a simple Integer Linear Program, which encourages entity tags to agree across the two languages via bilingual constraints. Bilingual NER experiments on the large OntoNotes 4.0 Chinese-English corpus show that the proposed method improves on strong baselines for both Chinese and English. In particular, Chinese performance improves by over 5% absolute F1. We then use our method to annotate a large amount of bilingual text (80k sentence pairs) and add it as uptraining data to the original monolingual NER training corpus. The Chinese model retrained on this combined dataset outperforms the strong baseline by over 3% F1.
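To make the idea of agreement constraints concrete, the following is a minimal sketch of joint decoding over a single sentence pair. It is not the formulation used in the paper: the tag set, the per-token scores, the word alignments, and the `bonus` weight are all hypothetical, and exhaustive enumeration stands in for an ILP solver so that the example has no dependencies. The objective is the sum of each monolingual model's token scores plus a reward for every aligned word pair that receives the same tag.

```python
from itertools import product

# Hypothetical tag set, kept tiny so exhaustive search stays cheap.
TAGS = ("O", "PER", "ORG")


def joint_decode(scores_zh, scores_en, alignments, bonus=1.0):
    """Return (objective, zh_tags, en_tags) maximizing a toy integer program.

    scores_zh / scores_en: one dict per token mapping tag -> monolingual score.
    alignments: (zh_index, en_index) pairs of aligned words.
    bonus: reward added for each aligned pair that carries the same tag
           (an assumed soft agreement constraint).
    """
    best = None
    for tags_zh in product(TAGS, repeat=len(scores_zh)):
        for tags_en in product(TAGS, repeat=len(scores_en)):
            total = sum(scores_zh[i][t] for i, t in enumerate(tags_zh))
            total += sum(scores_en[j][t] for j, t in enumerate(tags_en))
            total += bonus * sum(tags_zh[i] == tags_en[j] for i, j in alignments)
            if best is None or total > best[0]:
                best = (total, tags_zh, tags_en)
    return best


# Made-up scores: the Chinese model slightly prefers "O" for the first token,
# while the English model is confident that its aligned word is a person.
SCORES_ZH = [
    {"O": -0.6, "PER": -0.9, "ORG": -2.0},
    {"O": -0.1, "PER": -2.5, "ORG": -3.0},
]
SCORES_EN = [
    {"O": -2.0, "PER": -0.1, "ORG": -3.0},
    {"O": -0.2, "PER": -2.0, "ORG": -2.5},
]
ALIGNMENTS = [(0, 0), (1, 1)]

if __name__ == "__main__":
    for weight in (0.0, 1.0):
        _, zh, en = joint_decode(SCORES_ZH, SCORES_EN, ALIGNMENTS, bonus=weight)
        print(f"bonus={weight}: zh={zh} en={en}")
```

With `bonus=0.0` each side is decoded independently and the two sentences disagree on the first aligned pair; with `bonus=1.0` the confident English prediction pulls the Chinese tag toward `PER`. A real system would replace the enumeration with an ILP solver and would score whole tag sequences rather than independent tokens.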
