Measuring foreign accent strength in English: Validating Levenshtein distance as a measure

With an eye toward measuring the strength of foreign accents in American English, we evaluate the suitability of a modified version of the Levenshtein distance (LD) for comparing phonetic transcriptions of accented pronunciations. Although this measure has been used successfully to study, among other things, differences among dialect pronunciations, it has not been applied to foreign accents. Here, we use it to compare the pronunciations of non-native English speakers with those of native American English speakers. Our results indicate that the Levenshtein distance is a valid measure of native-likeness: it correlates strongly with the average "native-like" judgments given by more than 1000 native American English raters (r = -0.8, p < 0.001).
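To make the idea concrete, the following is a minimal sketch of a plain, unit-cost Levenshtein distance computed over sequences of phonetic segments. It is not the modified version evaluated here, whose details are not given in this abstract. The length normalization, the function names, and the example transcriptions are all assumptions chosen for illustration.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost edit distance between two sequences of phonetic segments."""
    # prev[j] holds the distance between the first i-1 segments of a
    # and the first j segments of b.
    prev = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, start=1):
        curr = [i]  # deleting all i segments of a to reach an empty prefix of b
        for j, seg_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if seg_a == seg_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def normalized_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed normalization: divide by the length of the longer sequence."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# Hypothetical transcriptions of the word "please" (illustration only).
native = ["p", "l", "iː", "z"]
accented = ["p", "l", "i", "s"]
print(levenshtein(native, accented))          # 2 (two substitutions)
print(normalized_distance(native, accented))  # 0.5
```

Each transcription is passed as a list of segments rather than a raw string, so that a multi-character symbol such as "iː" counts as a single unit. Under this sketch, a larger distance from the native pronunciation would correspond to a less native-like rating, which is consistent with the negative correlation reported above.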