Semantic Annotation of Transcribed Audio Broadcast News Using Contextual Features in Graphical Discriminative Models

In this paper we propose an efficient approach to named entity retrieval (NER) in transcribed speech documents that exploits the hierarchical structure of the entities. The NER task consists of identifying and classifying every word in a document into predefined categories such as person names, locations, organizations, and dates. Classical NER systems usually rely on generative approaches that learn models from word-level characteristics only (the word context). In this work we show that NER is also sensitive to syntactic and semantic contexts. For this reason, we introduce an extension of the conditional random fields (CRFs) approach that takes multiple contexts into account. We also present an adaptation of this text-based approach to automatic speech recognition (ASR) outputs. Experimental results on the ESTER 2 campaign data show that the proposed approach outperforms a straightforward application of CRFs. It ranks 4th among the ESTER 2 participating sites and achieves a significant relative improvement of 18% in slot error rate (SER) over an HMM-based method.
