Gender and Animacy Knowledge Discovery from Web-Scale N-Grams for Unsupervised Person Mention Detection

In this paper we present a simple approach to discovering gender and animacy knowledge for person mention detection. We collect noun-gender and noun-animacy pair counts from web-scale n-grams using specific lexical patterns, then apply confidence estimation metrics to filter out noise. The selected informative pairs are then used to detect person mentions in raw text within an unsupervised learning framework. Experiments show that this approach achieves performance comparable to state-of-the-art supervised learning methods, which require manually annotated corpora and gazetteers.
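
The counting and filtering steps described above can be illustrated with a minimal sketch. The abstract does not give the actual lexical patterns or confidence metrics, so everything here is an assumption made for illustration: the reflexive-pronoun pattern, the gender labels, the `min_total` and `min_share` thresholds, and the toy n-gram counts are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical lexical pattern (not taken from the paper): a noun, one
# intervening token, and a reflexive pronoun, e.g. "actress introduced herself".
PATTERN = re.compile(r"^(\w+) \w+ (himself|herself|itself|themselves)$")

# Assumed mapping from pronoun to a gender label.
PRONOUN_GENDER = {
    "himself": "masculine",
    "herself": "feminine",
    "itself": "neutral",
    "themselves": "plural",
}


def count_pairs(ngrams):
    """Accumulate noun-gender pair counts from (ngram_text, frequency) tuples."""
    counts = defaultdict(Counter)
    for text, freq in ngrams:
        match = PATTERN.match(text.lower())
        if match:
            noun, pronoun = match.groups()
            counts[noun][PRONOUN_GENDER[pronoun]] += freq
    return counts


def select_confident(counts, min_total=50, min_share=0.8):
    """Keep a noun only if it is frequent enough and one gender dominates.

    This relative-frequency filter is a stand-in for the paper's confidence
    estimation metrics, which the abstract does not specify.
    """
    selected = {}
    for noun, by_gender in counts.items():
        total = sum(by_gender.values())
        gender, top = by_gender.most_common(1)[0]
        if total >= min_total and top / total >= min_share:
            selected[noun] = gender
    return selected


# Toy data standing in for web-scale n-gram counts (invented numbers).
ngrams = [
    ("actress introduced herself", 120),
    ("actress introduced himself", 4),
    ("committee dissolved itself", 90),
    ("winner congratulated himself", 30),
    ("winner congratulated herself", 28),
]
pairs = select_confident(count_pairs(ngrams))
```

In this toy run, "actress" and "committee" pass the filter, while "winner" is dropped because neither gender dominates its counts. A real system would read counts from an n-gram corpus and would likely combine several patterns, including ones targeting animacy.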
