Phonological Distance Measures

Abstract: Phonological distance can be measured computationally using formally specified algorithms. This work investigates two such measures: one developed by Nerbonne and Heeringa (1997), based on Levenshtein distance (Levenshtein, 1965), and the other an adaptation of Dunning's (1994) language classifier that uses maximum likelihood distance. Both measures are compared against naïve transcriptions of the speech of paediatric cochlear implant users. The new measure, maximum likelihood distance, correlates highly with both Levenshtein distance and the naïve transcriptions. Results are easier to obtain from this corpus than from dialect data, because cochlear implant speech is less intelligible than the speech of a different dialect, which is usually highly intelligible.
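As a rough illustration of the first measure, the sketch below computes plain Levenshtein distance between two phone sequences. It assumes unit costs for insertion, deletion, and substitution; the function name `levenshtein` and the example transcriptions are hypothetical, and the cost scheme actually used in the work summarized here may differ (for example, by weighting substitutions by phonetic similarity).

```python
from typing import Sequence


def levenshtein(source: Sequence[str], target: Sequence[str]) -> int:
    """Return the minimum number of single-segment edits turning source into target.

    Assumption: every insertion, deletion, and substitution costs 1.
    """
    # previous[j] holds the distance between the source prefix processed
    # so far and the first j segments of target.
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        current = [i]  # deleting all i source segments to reach an empty target
        for j, t in enumerate(target, start=1):
            substitution = previous[j - 1] + (0 if s == t else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# Hypothetical transcriptions, one string per phone segment.
reference = ["k", "æ", "t"]
produced = ["k", "ɑ", "t"]
print(levenshtein(reference, produced))  # → 1 (one vowel substitution)
```

Treating each phone as a list element rather than a character keeps multi-character IPA symbols (such as affricates or segments with diacritics) from being counted as several edits.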

[1]  J. Chambers, et al. Dialectology: Mechanisms of Variation, 1998.

[2]  W. Heeringa, et al. Perceptive evaluation of Levenshtein dialect distance measurements using Norwegian dialect data, 2004, Language Variation and Change.

[3]  John Nerbonne, et al. Measuring Dialect Distance Phonetically, 1997, SIGMORPHON@EACL.

[4]  Graeme Hirst, et al. Algorithms for language reconstruction, 2002.

[5]  E. Vajda. Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet, 2000.

[6]  Vladimir I. Levenshtein, et al. Binary codes capable of correcting deletions, insertions, and reversals, 1965.

[7]  Ted E. Dunning, et al. Statistical Identification of Language, 1994.

[8]  René Charbonneau, et al. Die Berechnung der phonetischen Variabilität: ein Beitrag zum objektiven Vergleich phonetischer Texte, 1972.

[9]  D. Sherman, et al. Review of Goldman-Fristoe Test of Articulation, 1970.

[10]  Hans Goebl, et al. Recent Advances in Salzburg Dialectometry, 2006, Lit. Linguistic Comput.

[11]  I. Good. The Population Frequencies of Species and the Estimation of Population Parameters, 1953.

[12]  Brett Kessler, et al. Computational dialectology in Irish Gaelic, 1995, EACL.

[13]  Shane S. Sturrock, et al. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. David Sankoff and Joseph Kruskal. ISBN 1-57586-217-4, 2000.

[14]  W. Heeringa, et al. Computational Comparison and Classification of Dialects, 2001.

[15]  Hans Goebl. Regards dialectométriques sur les données de l'"Atlas linguistique de la France" (ALF): Relations quantitatives et structures de profondeur, 2003.

[16]  Wilbert Jan Heeringa. Measuring dialect pronunciation differences using Levenshtein distance, 2004.

[17]  Steven B. Chin, et al. Children's consonant inventories after extended cochlear implant use, 2003, Journal of Speech, Language, and Hearing Research (JSLHR).