A comparison between human vowel normalization strategies and acoustic vowel transformation techniques

Perceptual and acoustic representations of vowel data were compared directly to evaluate the perceptual relevance of several speaker normalization transformations. The acoustic representations consisted of raw F0 and formant data. The perceptual representations were obtained through an experimental procedure, with phonetically trained listeners as subjects. The raw acoustic data were transformed according to several normalization schemes. The perceptual and the acoustic representations were compared using regression techniques. A z-score transformation of the raw data appeared to resemble the perceptual data.
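
As an illustration of what a z-score transformation of raw vowel data can look like, the following is a minimal sketch, not the procedure used in the study. It assumes that standardization is done per speaker and per acoustic variable (F0, F1, F2), which the abstract does not specify. The function name `zscore_by_speaker`, the record layout, and all measurement values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_by_speaker(tokens, features=("F0", "F1", "F2")):
    """Standardize each feature within each speaker: (x - mean) / sd.

    Assumption: normalization is speaker-intrinsic, i.e. the mean and
    standard deviation are computed over one speaker's own vowel tokens.
    Each speaker needs at least two tokens with non-identical values.
    """
    # Group the vowel tokens by speaker.
    by_speaker = defaultdict(list)
    for tok in tokens:
        by_speaker[tok["speaker"]].append(tok)

    normalized = []
    for speaker, toks in by_speaker.items():
        # Per-speaker mean and sample standard deviation for every feature.
        stats = {}
        for f in features:
            values = [t[f] for t in toks]
            stats[f] = (mean(values), stdev(values))
        # Re-express every token relative to its speaker's statistics.
        for t in toks:
            out = {"speaker": speaker, "vowel": t["vowel"]}
            for f in features:
                mu, sigma = stats[f]
                out[f] = (t[f] - mu) / sigma
            normalized.append(out)
    return normalized


# Hypothetical measurements in Hz for two speakers (illustrative only).
tokens = [
    {"speaker": "A", "vowel": "i", "F0": 120.0, "F1": 280.0, "F2": 2250.0},
    {"speaker": "A", "vowel": "a", "F0": 110.0, "F1": 750.0, "F2": 1300.0},
    {"speaker": "A", "vowel": "u", "F0": 125.0, "F1": 310.0, "F2": 870.0},
    {"speaker": "B", "vowel": "i", "F0": 220.0, "F1": 330.0, "F2": 2750.0},
    {"speaker": "B", "vowel": "a", "F0": 200.0, "F1": 900.0, "F2": 1500.0},
    {"speaker": "B", "vowel": "u", "F0": 230.0, "F1": 370.0, "F2": 950.0},
]

result = zscore_by_speaker(tokens)
for row in result:
    print(row)
```

After this transformation each speaker's values have mean 0 and standard deviation 1 on every variable, so differences in overall pitch and formant range between speakers are removed while the relative positions of the vowels within a speaker are kept.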