Low-dimensional representation of vowels based on all-pole modeling in the psychophysical domain

Abstract: A novel speech analysis method that uses several established psychoacoustic concepts is applied to the analysis of vowels. This perceptually based linear predictive analysis (PLP) models the auditory spectrum by the spectrum of a low-order all-pole model. The auditory spectrum is derived from the speech waveform by critical-band filtering, equal-loudness curve pre-emphasis, and intensity-loudness root compression. We demonstrate through analysis of both natural and synthetic speech that psychoacoustic concepts of spectral auditory integration in vowel perception, namely the F1, F2′ concept of Carlson and Fant and the 3.5 Bark auditory integration concept of Chistovich, are well modeled by the PLP method. A complete speech analysis-synthesis system based on the PLP method is also described in the paper.
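
The processing chain named in the abstract (critical-band filtering, equal-loudness pre-emphasis, root compression, low-order all-pole fit) can be illustrated with a minimal sketch. Everything specific in it is an assumption and is not taken from the abstract: the Bark warping formula, the shape of the critical-band curve, the equal-loudness approximation, the cube-root exponent, the model order of 5, and all function and parameter names (`hz_to_bark`, `equal_loudness`, `levinson_durbin`, `plp_all_pole`), which are hypothetical. It is meant only to show how the stages fit together, not to reproduce the authors' implementation.

```python
import numpy as np

def hz_to_bark(f_hz):
    # Assumed Bark warping: 6 * asinh(f / 600).
    return 6.0 * np.arcsinh(np.asarray(f_hz, dtype=float) / 600.0)

def equal_loudness(f_hz):
    # Assumed rational approximation of an equal-loudness weighting.
    w2 = (2.0 * np.pi * np.asarray(f_hz, dtype=float)) ** 2
    return ((w2 + 56.8e6) * w2 ** 2) / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))

def levinson_durbin(r, order):
    # Solve the autocorrelation normal equations for the all-pole
    # polynomial a (with a[0] = 1) and the final prediction error.
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def plp_all_pole(frame, sample_rate, order=5, n_fft=512):
    # Short-term power spectrum of the windowed frame.
    windowed = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bark = hz_to_bark(freqs)

    # Band centers spaced roughly 1 Bark apart from 0 to Nyquist.
    n_bands = int(np.ceil(bark[-1])) + 1
    centers = np.linspace(0.0, bark[-1], n_bands)
    center_hz = 600.0 * np.sinh(centers / 6.0)

    # Assumed asymmetric critical-band curve, evaluated on z = bark - center.
    z = bark[None, :] - centers[:, None]
    weights = np.zeros_like(z)
    lower = (z >= -1.3) & (z < -0.5)
    flat = (z >= -0.5) & (z <= 0.5)
    upper = (z > 0.5) & (z <= 2.5)
    weights[lower] = 10.0 ** (2.5 * (z[lower] + 0.5))
    weights[flat] = 1.0
    weights[upper] = 10.0 ** (-(z[upper] - 0.5))

    # Critical-band integration, equal-loudness weighting, cube-root compression.
    aud = (weights @ power) * equal_loudness(center_hz)
    aud = np.cbrt(aud)
    # The 0 Hz and Nyquist bands are poorly defined; copy their neighbors.
    aud[0], aud[-1] = aud[1], aud[-2]

    # Treat the auditory spectrum as a power spectrum: its inverse DFT
    # gives autocorrelation values for the all-pole fit.
    r = np.fft.irfft(aud)
    a, err = levinson_durbin(r[:order + 1], order)
    return a, err, aud
```

Calling `plp_all_pole(frame, sample_rate)` on a single speech frame returns the all-pole coefficients, the residual error, and the compressed auditory spectrum. Because the model is fitted to a smoothed, Bark-warped spectrum rather than to the raw power spectrum, a low order is enough to capture its few broad peaks.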