Named Entity Recognition from Speech Using Discriminative Models and Speech Recognition Confidence

This paper proposes a discriminative method for named entity recognition (NER) from automatic speech recognition (ASR) results. The proposed method uses the ASR confidence of each word as a feature that indicates whether the word has been correctly recognized. This makes NER more robust to the noisy input caused by ASR errors. The NER model is trained using ASR results and reference transcriptions with named entity (NE) annotation. Experimental results using support vector machines (SVMs) and speech data from Japanese newspaper articles show that the proposed method outperforms a simple application of text-based NER to the ASR results, especially in terms of precision.
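As a rough illustration of the idea in the abstract, the sketch below builds a per-word feature dictionary that combines the words in a context window with a binary ASR-confidence feature. The function name `token_features`, the window size, the 0.5 threshold, and the feature naming are all assumptions made for this example; the abstract does not specify the actual feature set.

```python
from typing import Dict, List


def token_features(
    words: List[str],
    confidences: List[float],
    i: int,
    window: int = 2,          # assumed context size, not from the paper
    threshold: float = 0.5,   # assumed confidence cut-off, not from the paper
) -> Dict[str, float]:
    """Build a sparse feature dict for the i-th recognized word."""
    feats: Dict[str, float] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(words):
            # Lexical feature: identity of the word at this relative position.
            feats[f"word[{offset}]={words[j]}"] = 1.0
            # Confidence feature: 1.0 if the ASR confidence suggests the
            # word was recognized correctly, 0.0 otherwise.
            feats[f"conf_ok[{offset}]"] = (
                1.0 if confidences[j] >= threshold else 0.0
            )
        else:
            # Position falls outside the utterance.
            feats[f"pad[{offset}]"] = 1.0
    return feats


if __name__ == "__main__":
    # Hypothetical ASR output with per-word confidence scores.
    words = ["tokyo", "de", "kaigi"]
    confidences = [0.9, 0.3, 0.8]
    print(token_features(words, confidences, 1, window=1))
```

Feature dictionaries of this kind could then be vectorized and passed to a discriminative classifier such as an SVM, with one training example per recognized word.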
