Linear AM decomposition for sinusoidal audio coding

We present a novel decomposition for sinusoidal audio coding using amplitude modulation of sinusoids via a linear combination of arbitrary basis vectors. The proposed method, which incorporates a perceptual distortion measure, is based on a relaxation of a non-linear least squares minimization. It offers benefits in the modeling of transients in audio signals. We compare the decomposition to constant-amplitude sinusoidal coding using rate-distortion curves and listening tests. Both indicate that, at the same bit-rate, perceptually significant improvements can be achieved using the proposed decomposition.
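To make the idea of amplitude modulation through a linear combination of basis vectors concrete, here is a minimal sketch in Python with NumPy. It is one plausible reading of the abstract, not the authors' formulation. It assumes a single sinusoid at a known frequency, and it assumes the relaxation means giving the cosine and sine carriers independent envelopes, so that the fit becomes linear least squares. The perceptual distortion measure is left out. The names `am_sinusoid_fit` and `poly_basis` and the polynomial basis are hypothetical choices made for illustration.

```python
import numpy as np

def am_sinusoid_fit(x, omega, basis):
    """Least-squares fit of one amplitude-modulated sinusoid at a fixed frequency.

    x:     (N,) signal segment
    omega: angular frequency in radians per sample (assumed known here)
    basis: (N, K) matrix whose columns are the envelope basis vectors
    """
    n = np.arange(len(x))
    # Every basis vector modulates both a cosine and a sine carrier.
    # Separate coefficients for the two carriers keep the problem linear
    # (an assumed form of the relaxation mentioned in the abstract).
    design = np.hstack([basis * np.cos(omega * n)[:, None],
                        basis * np.sin(omega * n)[:, None]])
    coeffs, *_ = np.linalg.lstsq(design, x, rcond=None)
    return coeffs, design @ coeffs

def poly_basis(num_samples, order):
    """Hypothetical envelope basis: monomials 1, t, ..., t^(order-1) on [0, 1]."""
    t = np.linspace(0.0, 1.0, num_samples)
    return np.vander(t, order, increasing=True)

# Toy transient: a sinusoid whose amplitude ramps up linearly (an onset).
N, omega, phase = 256, 0.3, 0.7
envelope = np.linspace(0.0, 1.0, N)
x = envelope * np.cos(omega * np.arange(N) + phase)

# Constant-amplitude model (K = 1) versus a two-vector AM model (K = 2).
_, x_const = am_sinusoid_fit(x, omega, poly_basis(N, 1))
_, x_am = am_sinusoid_fit(x, omega, poly_basis(N, 2))

err_const = float(np.sum((x - x_const) ** 2))
err_am = float(np.sum((x - x_am) ** 2))
print(f"constant-amplitude error: {err_const:.3e}")
print(f"AM (K = 2) error:         {err_am:.3e}")
```

In this toy case the ramp envelope lies in the span of the two basis vectors, so the AM model reproduces the segment up to rounding error while the constant-amplitude model cannot. Real audio would also require frequency estimation and the perceptual weighting described in the abstract, neither of which is sketched here.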
