Self-organized acoustic feature map in detection of fricative-vowel coarticulation.

The self-organizing map, a neural network algorithm of Kohonen, was used to detect coarticulatory variation of the fricative [s] preceding the vowels [a:], [i:], and [u:]. The results were compared with the psychoacoustic classification of the same samples to determine whether the map had extracted perceptually meaningful features of [s]. The map distinguished samples of [s] in front of [u:] from those in front of [a:] or [i:] throughout the fricative duration. Samples of [s] preceding [a:] and [i:] were distinguished from each other only shortly (about 40 ms) before vowel onset. The results agreed with the perceptual classifications. Most judgments (82%) of [s] in front of [u:] were correct, and this variant of [s] was recognized equally well from the first and second halves of segmented fricatives. Samples of [s] in front of [a:] and [i:] were distinguished from each other less accurately. When halves of segmented [s] were perceptually judged, differentiation between the following [a:] and [i:] was possible only on the basis of the second half of the fricative. The results demonstrate that the self-organizing map is a useful tool for extracting intersubject regularities in speech spectra. The map also provides an easily understandable on-line visualization of speech that can be used as feedback in therapy and education.
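
As an illustration of the general technique only, the following is a minimal sketch of a Kohonen-style self-organizing map in Python with NumPy. It is not the implementation used in the study. The map size (6 x 6), the 15-channel input vectors, the learning-rate and neighborhood schedules, and the function names `train_som` and `best_matching_unit` are all assumptions chosen for the example, and the input is synthetic data with three artificial clusters rather than speech spectra.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Return the (row, col) index of the map unit closest to x."""
    dist = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dist), dist.shape)


def train_som(data, rows=6, cols=6, epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small rectangular map; returns weights of shape (rows, cols, dim).

    All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every map unit, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 0.5     # neighborhood shrinks to 0.5
            bmu = best_matching_unit(weights, x)
            # Squared grid distance of each unit from the winning unit.
            d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))    # Gaussian neighborhood
            # Move the winner and its neighbors toward the input vector.
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights


# Synthetic stand-in data: three well-separated clusters of 15-channel vectors.
rng = np.random.default_rng(1)
centers = np.eye(3).repeat(5, axis=1) * 0.8 + 0.1   # shape (3, 15)
data = np.concatenate(
    [c + 0.05 * rng.standard_normal((30, 15)) for c in centers]
)

weights = train_som(data)

# Map location onto which each synthetic cluster center falls.
for k, c in enumerate(centers):
    print("cluster", k, "-> unit", best_matching_unit(weights, c))
```

After training, each input vector can be placed on the map by its best-matching unit, which is the sense in which a trained map can serve as a visualization: successive short-time spectra would trace a path across the grid.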