A differential perceptual audio coding method with reduced bitrate requirements

A new audio transform coding technique is proposed that reduces the bitrate requirements of perceptual transform audio coders by exploiting the stationarity of audio signals. The method detects frames that have significant audible content and codes them in a way similar to conventional perceptual transform coders. When successive frames are found to be similar to these frames, only their audible differences are coded. An error analysis of the proposed method is presented, together with results from tests on different types of audio material. These indicate that an average compression gain of 30% (relative to the bitrate of conventional perceptual audio coders) can be achieved, with a small deterioration in the audio quality of the coded signal. The proposed method is easily adapted to the architecture of perceptual transform coders and adds only a small computational overhead to these systems.
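The frame-selection idea described above can be illustrated with a minimal sketch. It is not the coder evaluated in the paper: the fixed `threshold` stands in for a real psychoacoustic masking model, and the `similarity` rule, the function names, and the `"key"`/`"diff"` labels are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Frame = List[float]  # transform coefficients of one frame (assumed representation)


def audible(coeffs: Frame, threshold: float) -> Frame:
    """Zero every coefficient below the threshold (a stand-in for masking)."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]


def nonzero(coeffs: Frame) -> int:
    """Count the coefficients that would actually have to be transmitted."""
    return sum(1 for c in coeffs if c != 0.0)


def encode(frames: List[Frame], threshold: float = 0.1,
           similarity: float = 0.5) -> List[Tuple[str, Frame]]:
    """Code each frame either in full ("key") or as an audible difference ("diff")."""
    coded: List[Tuple[str, Frame]] = []
    reference = None  # last fully coded frame, as the decoder will see it
    for frame in frames:
        full = audible(frame, threshold)
        if reference is not None:
            diff = audible([a - b for a, b in zip(frame, reference)], threshold)
            # Hypothetical similarity rule: use the difference only when it
            # needs far fewer coefficients than coding the frame in full.
            if nonzero(diff) <= similarity * nonzero(full):
                coded.append(("diff", diff))
                continue
        coded.append(("key", full))
        reference = full
    return coded


def decode(coded: List[Tuple[str, Frame]]) -> List[Frame]:
    """Rebuild frames by adding each difference to the last key frame."""
    out: List[Frame] = []
    reference: Frame = []
    for kind, payload in coded:
        if kind == "key":
            reference = payload
            out.append(list(payload))
        else:
            out.append([r + d for r, d in zip(reference, payload)])
    return out


frames = [[1.0, 0.5, 0.0, 0.0], [1.0, 0.52, 0.0, 0.0], [0.0, 0.0, 2.0, 1.0]]
coded = encode(frames)
decoded = decode(coded)
```

In this toy input the second frame differs from the first by less than the threshold, so it is coded as an empty difference, while the third frame is unrelated and is coded in full. The reconstruction error of a difference frame stays below the threshold per coefficient, which loosely mirrors the small quality deterioration the abstract mentions.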
