Paper Information - Pattern Mining for Named Entity Recognition

Pattern Mining for Named Entity Recognition

Many evaluation campaigns have shown that knowledge-based and data-driven approaches remain equally competitive for Named Entity Recognition (NER). Our research team has developed CasEN, a symbolic system based on finite-state transducers, which achieved promising results in the Ester2 French-language evaluation campaign. Despite these encouraging results, manually extending the coverage of such a hand-crafted system is difficult. In this paper, we present a novel pattern-mining approach that both performs NER and supplements our system’s knowledge base. The system, mXS, exhaustively searches for hierarchical sequential patterns that aim to detect Named Entity boundaries. We assess their efficiency by using these patterns in standalone mode and in combination with our existing system.
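To make the idea of boundary-detecting patterns concrete, here is a minimal sketch in Python. It is not the mXS algorithm: it only mines contiguous contexts of one or two tokens, lets each token stand as either its word or its part-of-speech tag (a two-level hierarchy), and keeps the contexts that are reliably followed by an entity-boundary marker. The toy corpus, the marker names (`<pers>`, `</pers>`), the function names, and the support and confidence thresholds are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical toy corpus: tokens are (word, POS) pairs; plain strings
# such as "<pers>" are entity-boundary markers from the annotation.
CORPUS = [
    [("the", "DET"), ("president", "NOUN"), "<pers>", ("Jacques", "PROPN"),
     ("Chirac", "PROPN"), "</pers>", ("said", "VERB")],
    [("the", "DET"), ("minister", "NOUN"), "<pers>", ("Alain", "PROPN"),
     ("Juppé", "PROPN"), "</pers>", ("left", "VERB")],
    [("the", "DET"), ("minister", "NOUN"), ("said", "VERB")],
]


def is_marker(token):
    """Boundary markers are stored as plain strings, tokens as tuples."""
    return isinstance(token, str)


def generalizations(context):
    """All versions of a context where each token is kept as its word
    or lifted to its POS tag (set() avoids duplicates if both coincide)."""
    return set(product(*[(word, pos) for word, pos in context]))


def mine_rules(corpus, max_len=2, min_support=2, min_confidence=0.5):
    """Return {(pattern, marker): (support, confidence)} for contexts
    that are immediately followed by a boundary marker often enough."""
    context_count = Counter()  # occurrences of each generalized context
    rule_count = Counter()     # occurrences followed by a given marker
    for sent in corpus:
        for end in range(1, len(sent) + 1):
            following = sent[end] if end < len(sent) else None
            for n in range(1, max_len + 1):
                start = end - n
                # Stop at the sentence start or when the context would
                # reach across an existing boundary marker.
                if start < 0 or is_marker(sent[start]):
                    break
                for pattern in generalizations(sent[start:end]):
                    context_count[pattern] += 1
                    if following is not None and is_marker(following):
                        rule_count[(pattern, following)] += 1
    rules = {}
    for (pattern, marker), support in rule_count.items():
        confidence = support / context_count[pattern]
        if support >= min_support and confidence >= min_confidence:
            rules[(pattern, marker)] = (support, confidence)
    return rules


if __name__ == "__main__":
    for (pattern, marker), (support, confidence) in sorted(mine_rules(CORPUS).items()):
        print(" ".join(pattern), "->", marker, f"support={support} confidence={confidence:.2f}")
```

On this toy corpus, the generalized context `DET NOUN` is kept as a predictor of an opening `<pers>` marker, while the word-level context `minister` is discarded because it occurs before a marker only once. A real system would mine far richer patterns over a large annotated corpus.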

Damien Nouvel | Jean-Yves Antoine | Nathalie Friburger
