In this paper, we propose a novel method for modelling native accented speech. As an alternative to the notion of dialect, we work with the lower-level phonological components of accents, which we term accent features. This gives us a better understanding of how pronunciation varies and allows us to give a much more detailed picture of a person’s speech. The accent features are used during phonological adaptation of a speaker-independent Automatic Speech Recognition (ASR) system to make it more robust to pronunciation variation, thus improving recognition performance on accented speech. We employ a dynamic set-up in which the system first identifies the phonetic characteristics of the user’s speech. It then creates a model of the speaker’s phonological system and adapts the pronunciation dictionary to best match the speaker’s pronunciation. Recognition is subsequently carried out using the adapted pronunciation dictionary. Experiments on British English speech data show a significant relative improvement in error rate of 20% compared with the traditional non-adaptive method.
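The dynamic set-up described above (identify the speaker's characteristics, model them, adapt the dictionary) can be illustrated with a minimal sketch. It is not the method used in the paper: it assumes, purely for illustration, that an accent feature can be reduced to a context-free phoneme substitution, and that the observed pronunciations have the same length as the baseline ones. The function names `identify_features` and `adapt_dictionary`, the `min_count` parameter, and the ASCII phone symbols are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Pronunciation = List[str]
# Hypothetical representation: an accent feature is a pair
# (baseline phoneme, phoneme the speaker uses instead).
AccentFeature = Tuple[str, str]


def identify_features(
    base: Dict[str, Pronunciation],
    observed: Dict[str, Pronunciation],
    min_count: int = 1,
) -> List[AccentFeature]:
    """Collect phoneme substitutions seen in the speaker's calibration words."""
    counts: Counter = Counter()
    for word, heard in observed.items():
        expected = base.get(word)
        # Simplifying assumption: skip unknown words and any pronunciation
        # with insertions or deletions (different length).
        if expected is None or len(expected) != len(heard):
            continue
        for exp_phone, heard_phone in zip(expected, heard):
            if exp_phone != heard_phone:
                counts[(exp_phone, heard_phone)] += 1
    return [pair for pair, n in counts.items() if n >= min_count]


def adapt_dictionary(
    base: Dict[str, Pronunciation],
    features: List[AccentFeature],
) -> Dict[str, Pronunciation]:
    """Apply every substitution to every entry of the baseline dictionary."""
    substitutions = dict(features)
    return {
        word: [substitutions.get(phone, phone) for phone in phones]
        for word, phones in base.items()
    }


# Illustrative baseline dictionary (phone symbols are made up for the sketch).
base_dictionary = {
    "bath": ["b", "aa", "th"],
    "path": ["p", "aa", "th"],
    "cat": ["k", "ae", "t"],
}
# One calibration word as this hypothetical speaker pronounces it.
speaker_sample = {"bath": ["b", "ae", "th"]}

features = identify_features(base_dictionary, speaker_sample)
adapted_dictionary = adapt_dictionary(base_dictionary, features)
```

In this sketch, a single calibration word is enough to change the entry for "path" as well, which is the point of adapting at the level of accent features rather than individual words. A context-free substitution is a strong simplification, since real accent variation typically depends on phonological context and word class.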