Automatic Dialect Identification: A Study of British English

This contribution deals with the automatic identification of the dialects of the British Isles. Several methods based on the linguistic study of dialect-specific vowel systems are proposed and compared using the Accents of the British Isles (ABI) corpus. The first method examines differences in diphthongization for the FACE lexical set. In a two-dialect discrimination task, the proportion of correct decisions ranges from chance level to about 98%, depending on the pair of dialects under test. Using the ACCDIST method (developed in [1,2]), the second and third experiments take dialectal differences in the structure of vowel systems into consideration; evaluation is performed on a 13-dialect closed-set identification task. Correct identification reaches up to 90% with two subsets of the ABI corpus (the /hVd/ set and the read passages). All these experiments rely on a front-end automatic phonetic alignment and are therefore text-dependent. Results and possible improvements are discussed in the light of British dialectology.
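
To make the ACCDIST-style identification step more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each speaker is summarized by one feature vector per /hVd/ word, that a speaker's vowel system is represented by the table of pairwise distances between those vectors, and that a test speaker is assigned to the dialect whose reference table correlates best with their own. The (F1, F2) values in Hz, the dialect labels and the function names are hypothetical and serve only as illustration; the abstract does not specify the features actually used.

```python
from itertools import combinations
from math import dist, sqrt


def distance_table(vowels):
    """Flatten the pairwise Euclidean distances between a speaker's vowel vectors."""
    words = sorted(vowels)  # fixed word order so tables are comparable across speakers
    return [dist(vowels[a], vowels[b]) for a, b in combinations(words, 2)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def identify(speaker, references):
    """Return the label of the reference dialect whose table correlates best."""
    table = distance_table(speaker)
    return max(references, key=lambda d: pearson(table, distance_table(references[d])))


# Hypothetical reference vowel systems (F1, F2 in Hz); values are invented.
REFERENCES = {
    "dialect_a": {"heed": (300, 2300), "had": (700, 1700),
                  "hood": (400, 1000), "hud": (650, 1200)},
    # In dialect_b the vowel of "hud" sits close to that of "hood".
    "dialect_b": {"heed": (300, 2300), "had": (700, 1700),
                  "hood": (400, 1000), "hud": (420, 1050)},
}

# Hypothetical test speaker: dialect_b's system with every value scaled by 1.1,
# mimicking a speaker-specific difference that leaves the structure unchanged.
speaker = {w: (f1 * 1.1, f2 * 1.1) for w, (f1, f2) in REFERENCES["dialect_b"].items()}

print(identify(speaker, REFERENCES))
```

Because the comparison is made on correlations between distance tables rather than on raw feature values, a uniform scaling of one speaker's measurements leaves the match with the structurally identical reference unaffected, which is the property this toy example is meant to show.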

[1] Susanne Schötz, et al. Acoustic Analysis of Adult Speaker Age, 2007, Speaker Classification.

[2] J. D. A. Widdowson, et al. The Linguistic Atlas of England, 1979.

[3] Qin Yan, et al. A comparative analysis of UK and US English accents in recognition and synthesis, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[4] W. J. Barry, et al. An approach to the problem of regional accent in automatic speech recognition, 1989.

[5] Christian Müller. Speaker Classification II, Selected Projects, 2007, Speaker Classification.

[6] Bruce Southard. The Linguistic Atlas of England. Ed. Harold Orton, Stewart Sanderson, and John Widdowson. London: Croom Helm, 1978. Unpaginated, 1981.

[7] Paul Boersma, et al. Praat: doing phonetics by computer, 2003.

[8] Michael Jessen, et al. Speaker Classification in Forensic Phonetics and Acoustics, 2007, Speaker Classification.

[9] Vladimir Makarenkov, et al. Optimal Variable Weighting for Ultrametric and Additive Trees and K-means Partitioning: Methods and Software, 2001, J. Classif.

[10] Mark Huckvale. ACCDIST: An Accent Similarity Metric for Accent Recognition and Diagnosis, 2007, Speaker Classification.

[11] Jonas Beskow, et al. Wavesurfer - an open source speech tool, 2000, INTERSPEECH.

[12] John C. Wells, et al. Accents of English, 1982.

[13] J. Gower, et al. Metric and Euclidean properties of dissimilarity coefficients, 1986.

[14] R. L. Fletcher. The British Isles, 1996.

[15] Christian A. Müller, et al. A Study of Acoustic Correlates of Speaker Age, 2007, Speaker Classification.

[16] Mark Huckvale, et al. ACCDIST: a metric for comparing speakers' accents, 2004, INTERSPEECH.

[17] John H. L. Hansen, et al. Advances in word based dialect/accent classification, 2005, INTERSPEECH.

[18] Ian T. Nabney, et al. Netlab: Algorithms for Pattern Recognition, 2002.