Cross-entropic comparison of formants of British, Australian and American English accents

This paper highlights the differences in spectral features between British, Australian and American English accents. It applies the cross-entropy information measure to quantify and compare the impact of variations in accent, speaker group and recording on the probability models of the spectral features of phonetic units of speech. A comparison of the cross-entropies of formant and cepstrum features indicates that formants are a better indicator of accent. In particular, measurements of formant differences across accents appear to be less sensitive to differences in recordings or databases than measurements based on cepstrum features. The cross-entropies of the same phonemes across speaker groups with different accents (inter-accent distances) are found to be significantly greater than those across speaker groups of the same accent (intra-accent distances). Comparative evaluations on cross-gender speech recognition show that accent differences have an impact comparable to gender differences. The cross-entropy measure is also used to construct cross-accent phonetic trees, which show the structural similarities and differences of the phonetic systems across accents.
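As an illustration of how a cross-entropy based distance between two phoneme models might be computed, the sketch below assumes that each phoneme's formant vector is modelled by a single Gaussian with diagonal covariance. That model form is an assumption made here for brevity and is not stated in the abstract; the function names and the formant values are hypothetical, chosen only to show the calculation. The sketch computes the cross-entropy H(p, q), the relative entropy (cross-entropy minus the entropy of p), and a symmetrised distance.

```python
import math

def cross_entropy_gauss(mu_p, var_p, mu_q, var_q):
    """Cross-entropy H(p, q) = -E_p[log q], in nats, for diagonal Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += 0.5 * (math.log(2.0 * math.pi * vq) + (vp + (mp - mq) ** 2) / vq)
    return total

def relative_entropy_gauss(mu_p, var_p, mu_q, var_q):
    """KL(p || q) = H(p, q) - H(p, p); zero when the two models coincide."""
    return (cross_entropy_gauss(mu_p, var_p, mu_q, var_q)
            - cross_entropy_gauss(mu_p, var_p, mu_p, var_p))

def symmetric_distance(mu_p, var_p, mu_q, var_q):
    """Symmetrised distance: KL(p || q) + KL(q || p)."""
    return (relative_entropy_gauss(mu_p, var_p, mu_q, var_q)
            + relative_entropy_gauss(mu_q, var_q, mu_p, var_p))

# Hypothetical (F1, F2) means in Hz and variances in Hz^2 for one vowel.
# These numbers are invented for illustration, not taken from the paper.
group_a = ([700.0, 1200.0], [60.0 ** 2, 90.0 ** 2])   # accent X, speaker group 1
group_b = ([710.0, 1220.0], [65.0 ** 2, 95.0 ** 2])   # accent X, speaker group 2
group_c = ([620.0, 1450.0], [60.0 ** 2, 100.0 ** 2])  # accent Y

intra = symmetric_distance(*group_a, *group_b)
inter = symmetric_distance(*group_a, *group_c)
print(f"intra-accent distance: {intra:.3f} nats")
print(f"inter-accent distance: {inter:.3f} nats")
```

With these invented values the inter-accent distance comes out larger than the intra-accent distance, which is the kind of comparison the abstract describes. Mixture or HMM-based models would need a different, typically approximate, computation.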
