Combined spectral envelope normalization and subtraction of sinusoidal components in the ODFT and MDCT frequency domains

Recent research in high-quality audio coding seeks not only improved coding gains but also new functionalities, such as easy semantic access to compressed audio material and audio modification in the compressed domain. These objectives imply decomposing the audio signal into several components of specific semantic value, such as sinusoidal components, each of which can benefit from selective coding and parametrization tools. We assume an MDCT-based audio coding environment and present a new technique that combines spectral envelope normalization with accurate subtraction of sinusoidal components in the MDCT frequency domain. It is shown how a parametrization of L stationary sinusoids in the complex ODFT spectrum can lead to the effective subtraction, in the real MDCT spectrum, of 3L spectral lines. A demonstration of the implementation of the technique is available on the Internet (see http://www.inescn.pt/~ajf/waspaa01/flattening.html).
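To make the link between the complex ODFT spectrum and the real MDCT spectrum concrete, the following is a minimal NumPy sketch and not the implementation used in this work. It relies on the standard identity that, for a real windowed frame, each MDCT coefficient is the real part of the corresponding ODFT bin after a phase rotation. Several things here are assumptions made for illustration: the function names, the sine window, the unscaled transform definitions with time shift n0 = 1/2 + N/4, and the test signal. The sinusoid parameters are taken as exactly known, the three lines are synthesized by transforming a time-domain tone rather than from a closed-form window response, and spectral envelope normalization is not shown.

```python
import numpy as np

def odft(frame):
    """Odd-frequency DFT: bin k is centred at (k + 1/2) * 2*pi/N."""
    N = len(frame)
    n = np.arange(N)
    # A half-bin frequency shift followed by an ordinary FFT.
    return np.fft.fft(frame * np.exp(-1j * np.pi * n / N))

def mdct_from_odft(X_o):
    """MDCT coefficients as the real part of phase-rotated ODFT bins."""
    N = len(X_o)
    k = np.arange(N // 2)
    n0 = 0.5 + N / 4  # assumed MDCT time shift
    return np.real(X_o[:N // 2] * np.exp(-2j * np.pi * (k + 0.5) * n0 / N))

def mdct_direct(frame):
    """Reference MDCT computed from its cosine definition (unscaled)."""
    N = len(frame)
    n = np.arange(N)
    k = np.arange(N // 2)[:, None]
    n0 = 0.5 + N / 4
    return (frame * np.cos(2 * np.pi * (k + 0.5) * (n + n0) / N)).sum(axis=1)

def subtract_sinusoid(X_m, window, freq_bins, amp, phase):
    """Subtract the three MDCT lines around one sinusoid.

    freq_bins is the frequency in units of the bin spacing; the
    parameters are assumed to be known exactly (hypothetical helper).
    """
    N = len(window)
    n = np.arange(N)
    tone = amp * np.cos(2 * np.pi * freq_bins * n / N + phase)
    S_m = mdct_from_odft(odft(window * tone))
    peak = int(np.floor(freq_bins))  # bin whose centre k + 1/2 is nearest
    lines = np.arange(max(peak - 1, 0), min(peak + 2, N // 2))
    residual = X_m.copy()
    residual[lines] -= S_m[lines]
    return residual, lines

# Illustrative frame: one stationary sinusoid plus low-level noise.
N = 64
n = np.arange(N)
window = np.sin(np.pi * (n + 0.5) / N)  # sine window (assumption)
rng = np.random.default_rng(0)
f, a, p = 10.3, 1.0, 0.7
x = a * np.cos(2 * np.pi * f * n / N + p) + 0.01 * rng.standard_normal(N)

X_m = mdct_from_odft(odft(window * x))
residual, lines = subtract_sinusoid(X_m, window, f, a, p)
```

With L sinusoids, the same subtraction would be repeated once per sinusoid, touching three lines each, which is where the count of 3L spectral lines comes from. In this sketch only the three selected lines change; all other MDCT coefficients are left as they were.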
