Modeling Spot Microphone Signals using the Sinusoidal Plus Noise Approach

This paper focuses on high-fidelity multichannel audio coding based on an enhanced adaptation of the well-known sinusoidal plus noise model (SNM). Sinusoids alone are not sufficient for high-quality audio modeling, because they do not capture all the audible information in a recording. The noise part must also be modeled to avoid an artificial-sounding resynthesis of the audio signal. In general, encoding the noise part requires a much higher bitrate than encoding the sinusoidal part. Our objective is to encode spot microphone signals using the SNM while exploiting interchannel similarities to achieve low bitrates. We demonstrate that, for a given multichannel audio recording, the noise part of each spot microphone signal (before the mixing stage) can be obtained by using that signal's noise envelope to transform the noise part of just one of the signals (the so-called "reference signal", which is fully encoded).
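
The idea in the last sentence can be illustrated with a minimal single-frame sketch. It rests on assumptions that the abstract does not state: the noise envelope is taken to be an all-pole (linear prediction) fit, the reference noise is whitened with its own envelope before being reshaped, and all function names (`lpc`, `fir_filter`, `all_pole`, `transplant_noise`) are hypothetical. The synthetic signals stand in for real noise residuals, and the sketch should not be read as the authors' actual procedure.

```python
import math
import random


def lpc(x, order):
    """Autocorrelation-method linear prediction via Levinson-Durbin.

    Returns (a, gain), where a[0] == 1 describes A(z) = 1 + a1 z^-1 + ...
    and gain is the RMS of the prediction residual.
    Assumes the frame has nonzero energy.
    """
    n = len(x)
    r = [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        refl = -acc / err
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + refl * prev[m - k]
        a[m] = refl
        err *= 1.0 - refl * refl
    return a, math.sqrt(err / n)


def fir_filter(x, a):
    """Inverse (whitening) filter: e[n] = sum_k a[k] * x[n - k]."""
    return [
        sum(a[k] * x[n - k] for k in range(min(len(a), n + 1)))
        for n in range(len(x))
    ]


def all_pole(e, a, gain):
    """Synthesis filter gain / A(z) applied to the excitation e."""
    y = [0.0] * len(e)
    for n in range(len(e)):
        acc = gain * e[n]
        for k in range(1, min(len(a) - 1, n) + 1):
            acc -= a[k] * y[n - k]
        y[n] = acc
    return y


def transplant_noise(ref_noise, spot_a, spot_gain):
    """Reshape the reference noise with a spot signal's envelope.

    Assumption: the decoder holds the fully encoded reference noise and,
    for the spot signal, only its envelope (spot_a, spot_gain).
    """
    order = len(spot_a) - 1
    ref_a, ref_gain = lpc(ref_noise, order)
    excitation = [v / ref_gain for v in fir_filter(ref_noise, ref_a)]
    return all_pole(excitation, spot_a, spot_gain)


# Synthetic stand-ins for the noise parts of two microphone signals.
rng = random.Random(0)
N = 4096
ref_noise = all_pole([rng.gauss(0, 1) for _ in range(N)], [1.0, -0.5], 1.0)
spot_noise = all_pole([rng.gauss(0, 1) for _ in range(N)], [1.0, -0.9], 2.0)

# Only the spot envelope is kept; its noise is rebuilt from the reference.
spot_a, spot_gain = lpc(spot_noise, 2)
rebuilt = transplant_noise(ref_noise, spot_a, spot_gain)
rebuilt_a, rebuilt_gain = lpc(rebuilt, 2)
```

In this sketch the rebuilt noise differs from the original spot noise sample by sample, but its estimated envelope (`rebuilt_a`, `rebuilt_gain`) closely follows the spot signal's envelope, which is the property the abstract relies on. A practical coder would presumably work frame by frame, which this single-frame sketch omits.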
