Paper information - Named Entities in Czech: Annotating Data and Developing NE Tagger

This paper deals with the treatment of Named Entities (NEs) in Czech. We introduce a two-level NE classification. We have used this classification for manual annotation of two thousand sentences, gaining more than 11,000 NE instances. Employing the annotated data and Machine-Learning techniques (namely the top-down induction of decision trees), we have developed and evaluated a software system aimed at automatic detection and classification of NEs in Czech texts.

Zdeněk Žabokrtský | Oldřich Krůza | Magda Ševčíková
