Whereas within-word co-articulation effects are nowadays usually handled adequately in automatic speech recognition, this is not always the case for phrase-level co-articulation (PLC) effects. This paper describes a first approach to dealing with phrase-level co-articulation: PLC rules are applied to the reference transcripts used for training our recogniser, and a set of temporary PLC phones is added that is later mapped back onto the original phones. In effect, we temporarily split the acoustic context into a general context and a PLC context. With this method, more robust models could be trained, because phones that are confused due to PLC effects, for example /v/-/f/ and /z/-/s/, receive their own models. A first attempt to apply this method is described.
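The idea of marking PLC contexts in the training transcripts and mapping them back afterwards can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the single rule (a word-initial /v/ or /z/ that follows a word-final voiceless obstruent is marked), the phone symbols, the `_plc` suffix for temporary phones, and the function names `apply_plc_rule` and `map_back`.

```python
# Hypothetical phone inventory and rule, for illustration only.
VOICELESS_OBSTRUENTS = {"p", "t", "k", "f", "s", "x"}

# Original phone -> temporary PLC phone (the naming scheme is an assumption).
PLC_TARGETS = {"v": "v_plc", "z": "z_plc"}
PLC_TO_ORIGINAL = {tmp: orig for orig, tmp in PLC_TARGETS.items()}


def apply_plc_rule(words):
    """Rewrite a transcript given as a list of words, each a list of phones.

    A word-initial /v/ or /z/ that directly follows a word-final voiceless
    obstruent is replaced by its temporary PLC phone, so that it can be
    given its own acoustic model during training.
    """
    result = []
    prev_final = None  # last phone of the previous non-empty word
    for word in words:
        phones = list(word)  # copy, so the input transcript is not modified
        if phones and prev_final in VOICELESS_OBSTRUENTS and phones[0] in PLC_TARGETS:
            phones[0] = PLC_TARGETS[phones[0]]
        result.append(phones)
        if phones:
            prev_final = phones[-1]
    return result


def map_back(words):
    """Map every temporary PLC phone back onto its original phone."""
    return [[PLC_TO_ORIGINAL.get(p, p) for p in phones] for phones in words]


transcript = [["h", "E", "t"], ["v", "e:", "l"]]
marked = apply_plc_rule(transcript)   # the /v/ after word-final /t/ is marked
restored = map_back(marked)           # identical to the original transcript
```

In this sketch the marked transcript would be used only while training, so that a /v/ in PLC context and a /v/ in general context get separate models; `map_back` then restores the original phone set.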