Paper information - A survey of named entity recognition and classification

This survey covers fifteen years of research in the Named Entity Recognition and Classification (NERC) field, from 1991 to 2006. We report observations about the languages, named entity types, domains, and textual genres studied in the literature. Early NERC systems relied on hand-made rules, whereas machine learning techniques are now widely used. These techniques are surveyed along with other critical aspects of NERC such as features and evaluation methods. Features are word-level, dictionary-level, and corpus-level representations of words in a document. Evaluation techniques, ranging from intuitive exact match to very complex matching techniques with adjustable cost of errors, are an indisputable key to progress.
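As an illustration of the "intuitive exact match" end of the evaluation spectrum mentioned above, the following is a minimal sketch, not code from the survey. It assumes entities are represented as (start, end, label) token spans and that a prediction counts as correct only when both boundaries and the type agree with a gold annotation. The names `Entity` and `exact_match_scores`, and the sample spans, are hypothetical.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    """Hypothetical span representation: token offsets plus an entity type."""
    start: int   # index of the first token of the mention
    end: int     # index one past the last token of the mention
    label: str   # entity type, e.g. "PERSON"


def exact_match_scores(gold, predicted):
    """Return (precision, recall, f1), counting only exact boundary-and-type matches."""
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up annotations for illustration only.
gold = [
    Entity(0, 2, "PERSON"),
    Entity(5, 6, "LOCATION"),
    Entity(9, 11, "ORGANIZATION"),
]
pred = [
    Entity(0, 2, "PERSON"),     # exact match
    Entity(5, 7, "LOCATION"),   # right type, wrong boundary: counted as an error
    Entity(9, 11, "PERSON"),    # right boundary, wrong type: counted as an error
]

precision, recall, f1 = exact_match_scores(gold, pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Under exact match, the boundary error and the type error are penalised as heavily as a completely spurious prediction. The more complex schemes the abstract alludes to would instead assign adjustable partial costs to such errors.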

Satoshi Sekine | David Nadeau
