Experiments on speaker-independent phone recognition using BREF

A series of experiments on speaker-independent, continuous-speech phone recognition has been carried out using the recently recorded BREF corpus. The authors' experiments are the first to use this database and are meant to provide a baseline performance evaluation for vocabulary-independent phone recognition. The system was trained using hand-verified data from 43 speakers. Using 35 context-independent phone models, a baseline phone accuracy of 60% (no phone grammar) was obtained on an independent test set of 7635 phone segments from 19 speakers. Including phone bigram probabilities as phonotactic constraints raises phone accuracy to 63.3%. With 428 context-dependent models, a phone accuracy of 68.6% (73.3% correct) was obtained.
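The abstract reports both "phone accuracy" and "percent correct". Under the conventional scoring definitions (assumed here, since the abstract does not spell them out), percent correct ignores insertion errors while accuracy penalizes them, so the gap between the two figures equals the insertion rate. The minimal Python sketch below illustrates that relationship; the function name `phone_scores` and the error counts in the usage example are hypothetical and are not taken from the paper. Only the reference count of 7635 segments comes from the abstract.

```python
def phone_scores(n_ref, subs, dels, ins):
    """Return (percent_correct, percent_accuracy) for a phone recognizer.

    Assumes the conventional definitions:
      correct  = (N - S - D) / N
      accuracy = (N - S - D - I) / N
    where N is the number of reference phone segments and S, D, I are
    substitution, deletion, and insertion counts from an alignment.
    """
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    hits = n_ref - subs - dels
    percent_correct = 100.0 * hits / n_ref
    percent_accuracy = 100.0 * (hits - ins) / n_ref
    return percent_correct, percent_accuracy


# Hypothetical error counts, chosen only for illustration.
# 7635 is the test-set size stated in the abstract.
correct, accuracy = phone_scores(n_ref=7635, subs=1500, dels=500, ins=400)
print(f"correct: {correct:.1f}%  accuracy: {accuracy:.1f}%")
print(f"insertion rate: {correct - accuracy:.1f}%")
```

Read this way, the reported 68.6% accuracy against 73.3% correct would correspond to an insertion rate of about 4.7% of the reference segments, assuming these definitions match the ones used by the authors.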
