On using the Itakura-Saito measures for speech coder performance evaluation

This paper discusses theoretical and psychophysical aspects of using Itakura-Saito type measures to evaluate the quality of coded speech. We present psychoacoustic interpretations of the measures and identify their effectiveness and limitations within the theoretical framework of a generalized waveform coder distortion model. The discussion then points out specific issues that remain to be resolved through psychoacoustic research.
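
For orientation, the Itakura-Saito distortion in its commonly stated form compares a reference power spectrum P with a test (coded) power spectrum Q by averaging P/Q - ln(P/Q) - 1 over frequency. The sketch below is a minimal discrete version of that standard definition, not the paper's own formulation: the function name `itakura_saito`, the uniform frequency sampling, and the plain-list inputs are assumptions made for illustration.

```python
import math


def itakura_saito(p_ref, p_test):
    """Discrete Itakura-Saito distortion between two sampled power spectra.

    p_ref  : reference (original speech) power spectrum samples
    p_test : test (coded speech) power spectrum samples
    Both are assumed to be sampled on the same uniform frequency grid.
    """
    if len(p_ref) != len(p_test) or len(p_ref) == 0:
        raise ValueError("spectra must be non-empty and of equal length")

    total = 0.0
    for p, q in zip(p_ref, p_test):
        if p <= 0 or q <= 0:
            raise ValueError("power spectra must be strictly positive")
        ratio = p / q
        # Each term is non-negative and is zero only when p == q.
        total += ratio - math.log(ratio) - 1.0

    # Average over the frequency samples (approximates the integral form).
    return total / len(p_ref)
```

Two properties are worth noting when reading the measure as a quality indicator: it is zero only when the two spectra coincide, and it is asymmetric, so swapping the reference and the coded spectrum generally changes the value.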
