Gooskens (2003) described an experiment that determined the linguistic distances between 15 Norwegian dialects as perceived by Norwegian listeners. The results were compared with Levenshtein distances calculated from transcriptions of the words in the same recordings that were used in the perception experiment. The Levenshtein distance is the sum of the weights of the insertions, deletions and substitutions needed to change one pronunciation into another. The success of this method depends on the reliability of the transcriber. The aim of this paper is to find an acoustic distance measure between dialects that approximates the perceptual distance measure. We use and compare different representations of the acoustic signal: Barkfilter spectrograms, cochleagrams and formant tracks. We apply the Levenshtein algorithm to spectra or formant value bundles instead of transcription segments. Of these acoustic representations, the formant track representation gave the best results. However, the transcription-based Levenshtein distances still correlate more closely with the perceptual distances. The acoustic signal retains speaker-dependent influence to some extent, whereas a transcriber abstracts from voice quality. Using more samples per dialect word (instead of only one, as in our research) should improve the accuracy of the measurements.
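The distance described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: the function names (`levenshtein`, `segment_cost`, `formant_cost`), the unit insertion/deletion weight, and the Euclidean cost between formant bundles are all assumptions made for illustration. The same dynamic-programming routine is applied once to transcription segments and once to sequences of numeric bundles.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def levenshtein(
    a: Sequence[T],
    b: Sequence[T],
    sub_cost: Callable[[T, T], float],
    indel_cost: float = 1.0,  # assumed weight for insertions and deletions
) -> float:
    """Least total weight of insertions, deletions and substitutions
    needed to change sequence a into sequence b."""
    # prev[j] holds the cost of turning the current prefix of a into b[:j].
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,          # delete x
                curr[j - 1] + indel_cost,      # insert y
                prev[j - 1] + sub_cost(x, y),  # substitute x by y
            ))
        prev = curr
    return prev[-1]


def segment_cost(x: str, y: str) -> float:
    """Binary substitution weight for transcription segments (an assumption)."""
    return 0.0 if x == y else 1.0


def formant_cost(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two formant value bundles (an assumption)."""
    return sum((p - q) ** 2 for p, q in zip(x, y)) ** 0.5


# Transcription-like input: each character stands for one segment.
print(levenshtein("kitten", "sitting", segment_cost))

# Acoustic-like input: each frame is a made-up bundle of two formant values.
frames_a = [(1.0, 2.0), (3.0, 4.0)]
frames_b = [(1.0, 2.0)]
print(levenshtein(frames_a, frames_b, formant_cost))
```

With numeric bundles the insertion/deletion weight has to be on the same scale as the substitution cost, so `indel_cost` would need to be chosen to suit the acoustic representation in use.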