Effective Bilingual Constraints for Semi-Supervised Learning of Named Entity Recognizers

Most semi-supervised methods in Natural Language Processing capitalize on unannotated resources in a single language; however, information can be gained from using parallel resources in more than one language, since translations of the same utterance in different languages can help to disambiguate each other. We demonstrate a method that makes effective use of vast amounts of bilingual text (a.k.a. bitext) to improve monolingual systems. We propose a factored probabilistic sequence model that encourages both cross-language and intra-document consistency. A simple Gibbs sampling algorithm is introduced for performing approximate inference. Experiments on English-Chinese Named Entity Recognition (NER) using the OntoNotes dataset demonstrate that our method is significantly more accurate than state-of-the-art monolingual CRF models in a bilingual test setting. Our model also improves on previous work by Burkett et al. (2010), achieving relative error reductions of 10.8% in Chinese and 4.5% in English. Furthermore, by annotating a moderate amount of unlabeled bitext with our bilingual model and using the tagged data for uptraining, we achieve a 9.2% error reduction in Chinese over the state-of-the-art Stanford monolingual NER system.
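The relative error reductions quoted in the abstract can be read as follows. This is a minimal sketch, assuming error is measured as 100 minus the F1 score (the abstract does not state the metric); the function name `relative_error_reduction` and the scores used are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_f1: float, improved_f1: float) -> float:
    """Return the relative error reduction, in percent, between two F1 scores.

    Both scores are percentages in [0, 100]. Error is assumed to be
    100 - F1, which is an assumption made for this illustration.
    """
    baseline_error = 100.0 - baseline_f1
    improved_error = 100.0 - improved_f1
    if baseline_error <= 0:
        raise ValueError("baseline has no error left to reduce")
    return 100.0 * (baseline_error - improved_error) / baseline_error


# Hypothetical scores for illustration only; not results from the paper.
print(round(relative_error_reduction(80.0, 82.0), 1))  # prints 10.0
```

Under this reading, a two-point F1 gain over an 80.0 baseline removes one tenth of the remaining error, which is why relative error reduction can look large even when the absolute gain is small.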

Wanxiang Che | Christopher D. Manning | Mengqiu Wang

[1] Avrim Blum, et al. The Bottleneck, 2021, Monopsony Capitalism.

[2] C. D. Gelatt, et al. Optimization by Simulated Annealing, 1983, Science.

[3] Dan Klein, et al. Learning Better Monolingual Models with Unannotated Bilingual Text, 2010, CoNLL.

[4] Andrew McCallum, et al. Collective Segmentation and Labeling of Distant Entities in Information Extraction, 2004.

[5] Donald Geman, et al. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, 1984.

[6] David Yarowsky, et al. Unsupervised Word Sense Disambiguation Rivaling Supervised Methods, 1995, ACL.

[7] Mitchell P. Marcus, et al. OntoNotes: The 90% Solution, 2006, NAACL.

[8] Alexander M. Rush, et al. Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints, 2012, EMNLP-CoNLL.

[9] Heng Ji, et al. Joint bilingual name tagging for parallel corpora, 2012, CIKM '12.

[10] Ting Liu, et al. Generating Chinese Named Entity Data from a Parallel Corpus, 2011, IJCNLP.

[11] Ben Taskar, et al. Alignment by Agreement, 2006, NAACL.

[12] Ellen Riloff, et al. Learning Dictionaries for Information Extraction by Multi-Level Bootstrapping, 1999, AAAI/IAAI.

[13] Christopher D. Manning, et al. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling, 2005, ACL.

[14] Slav Petrov, et al. Uptraining for Accurate Deterministic Question Parsing, 2010, EMNLP.

[15] Yoram Singer, et al. Unsupervised Models for Named Entity Classification, 1999, EMNLP.

[16] Kristina Toutanova, et al. Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia, 2012, ACL.

[17] Robert Tibshirani, et al. An Introduction to the Bootstrap, 1994.

[18] David Yarowsky, et al. Inducing Multilingual POS Taggers and NP Bracketers via Robust Projection Across Aligned Corpora, 2001, NAACL.

[19] Tong Zhang, et al. A High-Performance Semi-Supervised Learning Method for Text Chunking, 2005, ACL.

[20] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[21] Ben Taskar, et al. Multi-View Learning over Structured and Non-Identical Outputs, 2008, UAI.

[22] Razvan C. Bunescu, et al. Collective Information Extraction with Relational Markov Networks, 2004, ACL.

[23] Ting Liu, et al. Generating Chinese named entity data from parallel corpora, 2014, Frontiers of Computer Science.

[24] Slav Petrov, et al. Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections, 2011, ACL.