Rate-distortion function for speech coding based on perceptual distortion measure

The authors (1992) proposed a perceptual distortion measure for speech coders based on an auditory (cochlear) model. This measure evaluates the neural-firing cross-entropy of the coded speech with respect to that of the original speech. Here, the output space of the cochlear model is explored using this measure to verify that pitch and formant information is present. A rate-distortion analysis for speech coding is then provided: a lower bound to the rate-distortion function is evaluated under this distortion measure, and the exact rate-distortion function is computed using the Blahut (1972) algorithm. Four state-of-the-art speech coders, with rates ranging from 4.8 kb/s (CELP) to 32 kb/s (ADPCM), are assessed against these rate-distortion limits.
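
For readers unfamiliar with the Blahut algorithm mentioned above, the following is a minimal sketch of how it yields one point on a rate-distortion curve for a discrete memoryless source. It is not the authors' implementation: the function name `blahut_rd_point`, the slope parameter `beta`, the fixed iteration count, and the toy binary source with Hamming distortion are all assumptions made for illustration. The perceptual cross-entropy measure would enter only through the distortion matrix `d`.

```python
import math


def blahut_rd_point(p, d, beta, iters=200):
    """Return (rate_in_bits, distortion) at slope parameter beta.

    p    : source probabilities p[x]
    d    : distortion matrix d[x][y] (hypothetical; any non-negative measure)
    beta : positive slope parameter; larger beta gives lower distortion
    """
    n_x, n_y = len(p), len(d[0])
    q = [1.0 / n_y] * n_y  # reproduction marginal, initialised uniform

    def conditional(q):
        # Test channel: cond[x][y] proportional to q[y] * exp(-beta * d[x][y])
        cond = []
        for x in range(n_x):
            row = [q[y] * math.exp(-beta * d[x][y]) for y in range(n_y)]
            z = sum(row)
            cond.append([v / z for v in row])
        return cond

    def marginal(cond):
        return [sum(p[x] * cond[x][y] for x in range(n_x)) for y in range(n_y)]

    # Alternate between updating the test channel and its output marginal.
    for _ in range(iters):
        q = marginal(conditional(q))

    cond = conditional(q)
    q = marginal(cond)
    distortion = sum(
        p[x] * cond[x][y] * d[x][y] for x in range(n_x) for y in range(n_y)
    )
    rate = sum(
        p[x] * cond[x][y] * math.log2(cond[x][y] / q[y])
        for x in range(n_x)
        for y in range(n_y)
        if cond[x][y] > 0.0
    )
    return rate, distortion


# Toy example (assumed, not from the paper): uniform binary source, Hamming distortion.
rate, dist = blahut_rd_point([0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]], beta=2.0)
```

Sweeping `beta` over a range of positive values traces out the curve. For this toy source the result can be checked against the known closed form R(D) = 1 - H_b(D), where H_b is the binary entropy function.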