Perceptual irrelevancy removal in narrowband speech coding

A masking model originally designed for audio signals is applied to narrowband speech. The model is used to detect and remove the frequency components of a speech signal that are simultaneously masked and therefore perceptually irrelevant. Objective measurements show that the modified speech signal can be coded more efficiently than the original. Perceptual evaluation further confirms that removing these components does not significantly degrade speech quality; instead, it consistently improved the output quality of two standardized speech codecs. The proposed irrelevancy removal technique can therefore be used at the front end of a speech coder to improve coding efficiency.
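As a rough illustration of removing simultaneously masked components, the following toy sketch marks spectral bins whose level falls below a threshold set by the other bins and then drops them. It is not the masking model used in the paper: the function names, the triangular spreading function defined over bin index (rather than a Bark scale), the slopes, the 10 dB offset, and the use of the strongest single masker instead of a power sum are all assumptions chosen for readability.

```python
def masked_bins(levels_db, offset_db=10.0, slope_lo=25.0, slope_hi=10.0):
    """Return indices of bins lying below a toy masking threshold.

    levels_db : per-bin spectral levels in dB (hypothetical input).
    offset_db : assumed gap between a masker and the threshold it produces.
    slope_lo  : assumed threshold decay (dB per bin) towards lower bins.
    slope_hi  : assumed threshold decay (dB per bin) towards higher bins;
                shallower, mimicking the upward spread of masking.
    """
    masked = []
    for k, level in enumerate(levels_db):
        threshold = float("-inf")
        for j, masker in enumerate(levels_db):
            if j == k:
                continue  # a component does not mask itself
            dist = k - j
            slope = slope_hi if dist > 0 else slope_lo
            # Threshold contributed by masker j at bin k.
            contribution = masker - offset_db - slope * abs(dist)
            # Assumption: keep the strongest contribution (no power summation).
            threshold = max(threshold, contribution)
        if level < threshold:
            masked.append(k)
    return masked


def remove_masked(levels_db, floor_db=-120.0, **model_params):
    """Replace masked bins with a very low level, leaving the rest untouched."""
    drop = set(masked_bins(levels_db, **model_params))
    return [floor_db if k in drop else v for k, v in enumerate(levels_db)]


if __name__ == "__main__":
    spectrum = [60.0, 20.0, 55.0, 10.0, 0.0]
    print(masked_bins(spectrum))    # [1, 3, 4]
    print(remove_masked(spectrum))  # [60.0, -120.0, 55.0, -120.0, -120.0]
```

In this toy spectrum the two strong components at bins 0 and 2 survive, while the weak components next to them fall below the threshold and are removed. A front end of the kind described above would apply such a decision frame by frame before handing the signal to the coder.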
