Named entity recognition from spontaneous open-domain speech
This paper presents an analysis of named entity recognition and classification in spontaneous speech transcripts. We annotated a significant fraction of the Switchboard corpus with six named entity classes and investigated a battery of machine learning models that include lexical, syntactic, and semantic attributes. The best recognition and classification model obtains promising results, coming within 5% of a system evaluated on clean textual data.