Paper information - Acoustics-based phonetic transcription method for proper nouns

Acoustics-based phonetic transcription method for proper nouns

This paper presents an approach to improving the automatic phonetic transcription of proper nouns. The method is based on a two-level iterative process that extracts phonetic variants from the audio signal and then filters out the irrelevant variants. The evaluation shows a decrease in Word Error Rate (WER) on speech segments containing proper nouns, without negatively affecting the WER on the rest of the corpus (the ESTER corpus of French broadcast news).

Index Terms: speech recognition, phonetic transcription, proper nouns
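The abstract reports its results as Word Error Rate. As a reminder of what that metric measures, here is a minimal sketch of the conventional definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. This is not the paper's scoring tool; the function name `word_error_rate` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level edit distance / number of reference words.

    Illustrative helper (not from the paper); whitespace tokenization is assumed.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn ref[:i] into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one misrecognized proper noun out of four words.
    print(word_error_rate("le président sarkozy parle",
                          "le président sarcosi parle"))
```

A single misrecognized proper noun counts as one substitution, so in short segments that contain proper nouns, each such error weighs heavily on the segment-level WER.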

Paul Deléglise | Sylvain Meignier | Antoine Laurent | Téva Merlin
