Improving the discrimination between native accents when recorded over different channels

Acoustic differences between native accents may be too subtle for straightforward brute-force techniques, such as blindly clustered Gaussian mixture model (GMM) classifiers, to discriminate satisfactorily, even though these methods work well for more pronounced differences such as language, gender or channel. This paper shows that small channel differences are easier for such coarse classifiers to detect than native accent differences. Native accent classification improves considerably when knowledge of the underlying phoneme sequence is incorporated through phoneme-specific GMMs. Further improvement is obtained by combining optimal feature selection with the phoneme-dependent GMMs, which leaves fewer than 10% of the original features in use. Together, the presented methods reduce the error rate by more than 40% relative in a 5-class classification task.
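The idea of phoneme-dependent scoring can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each frame already carries a phoneme label (for example from a forced alignment), that one diagonal-covariance GMM per accent and phoneme has been trained beforehand, and that the accent with the highest summed log-likelihood is chosen. The function names, the toy one-dimensional models and the accent labels "A" and "B" are all hypothetical; training and feature selection are left out.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, gmm):
    """Log-likelihood of x under a GMM given as a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def classify_accent(frames, phonemes, models):
    """Score each accent with the GMM of the phoneme aligned to each frame.

    models maps accent -> phoneme -> GMM. Returns (best_accent, scores),
    where scores holds the summed frame log-likelihood per accent.
    """
    scores = {}
    for accent, per_phoneme in models.items():
        scores[accent] = sum(
            log_gmm(x, per_phoneme[p]) for x, p in zip(frames, phonemes)
        )
    return max(scores, key=scores.get), scores


# Hypothetical toy models: two accents, two phonemes, 1-D features.
toy_models = {
    "A": {
        "a": [(0.5, [0.0], [1.0]), (0.5, [1.0], [1.0])],
        "e": [(1.0, [3.0], [1.0])],
    },
    "B": {
        "a": [(1.0, [2.0], [1.0])],
        "e": [(1.0, [5.0], [1.0])],
    },
}

# Hypothetical utterance: three frames with their aligned phoneme labels.
toy_frames = [[0.2], [0.8], [3.1]]
toy_phonemes = ["a", "a", "e"]

best_accent, accent_scores = classify_accent(toy_frames, toy_phonemes, toy_models)
print(best_accent, accent_scores)
```

Because every frame is compared only against models of its own phoneme, variation between phonemes no longer masks the smaller variation between accents, which is the intuition behind conditioning the GMMs on the phoneme sequence.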
