Paper Information - Multichannel Audio Coding for Multimedia Services in Intelligent Environments

Audio is an integral component of multimedia services in intelligent environments. Using multiple channels for audio capture and rendering makes it possible to recreate arbitrary acoustic environments and immerse the listener in the acoustic scene. On the other hand, multichannel audio contains a large amount of information that is highly demanding to transmit, especially for real-time applications. For this reason, a variety of compression methods have been developed for multichannel audio content. In this chapter, we first describe the currently popular methods for multichannel audio compression. Low-bitrate encoding methods for multichannel audio have also recently started to attract interest, mostly with the aim of extending MP3 audio coding to multichannel recordings, and these methods are also examined here. For synthesizing a truly immersive intelligent audio environment, interactivity between the user(s) and the audio environment is essential. Towards this goal, we present recently proposed multichannel-audio-specific models, namely the source/filter and the sinusoidal models, which allow for flexible manipulation and high-quality low-bitrate encoding, and are tailored for applications such as remote mixing and distributed immersive performances.

Athanasios Mouchtaris | Panagiotis Tsakalides

[1] K. Karadimou,et al. Multichannel Audio Modeling and Coding Using a Multiband Source/Filter Model , 2005, Conference Record of the Thirty-Ninth Asilomar Conference on Signals, Systems and Computers, 2005.

[2] Karlheinz Brandenburg,et al. MP3 and AAC Explained , 1999 .

[3] Methods for the subjective assessment of small impairments in audio systems , 2015 .

[4] Jürgen Herre,et al. MPEG Spatial Audio Coding / MPEG Surround: Overview and Current Status , 2005 .

[5] Julius O. Smith,et al. Spectral modeling synthesis: A sound analysis/synthesis based on a deterministic plus stochastic decomposition , 1990 .

[6] R. Dressler. Dolby Surround Pro Logic II Decoder Principles of Operation , 2000 .

[7] Athanasios Mouchtaris,et al. Sinusoidal modeling of spot microphone signals based on noise transplantation for multichannel audio coding , 2007, 2007 15th European Signal Processing Conference.

[8] A. A. Sawchuk,et al. From remote media immersion to Distributed Immersive Performance , 2003, ETP '03.

[9] Takehiro Moriya,et al. High-quality audio-coding at less than 64 kbit/s by using transform-domain weighted interleave vector quantization (TwinVQ) , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[10] C.-C. Jay Kuo,et al. High-fidelity multichannel audio coding with Karhunen-Loeve transform , 2003, IEEE Trans. Speech Audio Process..

[11] Julius O. Smith,et al. Multiresolution sinusoidal modeling for wideband audio with modifications , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98.

[12] Thomas F. Quatieri,et al. Speech analysis/Synthesis based on a sinusoidal representation , 1986, IEEE Trans. Acoust. Speech Signal Process..

[13] E. Owens,et al. An Introduction to the Psychology of Hearing , 1997 .

[14] Menno Wildeboer. International Organisation for Standardisation (Organisation Internationale de Normalisation), ISO/IEC JTC1/SC29/WG11: Coding of Moving Pictures and Audio , 2009 .

[15] Christof Faller,et al. Binaural cue coding-Part II: Schemes and applications , 2003, IEEE Trans. Speech Audio Process..

[16] Jesper Jensen,et al. Perceptual linear predictive noise modelling for sinusoid-plus-noise audio coding , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[17] Athanasios Mouchtaris,et al. Modeling Spot Microphone Signals using the Sinusoidal Plus Noise Approach , 2007, 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.

[18] Marina Bosi,et al. ISO/IEC MPEG-2 Advanced Audio Coding: Overview and Applications , 1997 .

[19] Andreas Jakobsson,et al. Linear AM decomposition for sinusoidal audio coding , 2005, Proceedings (ICASSP '05), IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.

[20] Athanasios Mouchtaris,et al. Multiresolution Source/Filter Model for Low Bitrate Coding of Spot Microphone Signals , 2008, EURASIP J. Audio Speech Music. Process..

[21] Ronaldus Maria Aarts,et al. Two-to-Five Channel Sound Processing , 2002 .

[22] S. Haykin,et al. Adaptive Filter Theory , 1986 .

[23] Jeroen Breebaart,et al. Parametric Coding of Stereo Audio , 2005, EURASIP J. Adv. Signal Process..

[24] Heiko Purnhagen,et al. ASAC-Analysis/Synthesis Audio Codec for Very Low-Bit Rates , 1996 .

[25] Jürgen Herre,et al. Intensity Stereo Coding , 1994 .

[26] Heiko Purnhagen,et al. HILN-the MPEG-4 parametric audio coding tools , 2000, 2000 IEEE International Symposium on Circuits and Systems. Emerging Technologies for the 21st Century. Proceedings.

[27] Jenq-Neng Hwang. Multimedia Networking: Digital audio coding , 2009 .

[28] W. Bastiaan Kleijn,et al. On frequency quantization in sinusoidal audio coding , 2005, IEEE Signal Processing Letters.

[29] Yannis Stylianou,et al. Applying the harmonic plus noise model in concatenative speech synthesis , 2001, IEEE Trans. Speech Audio Process..

[30] K. Karadimou,et al. Packet Loss Concealment for Multichannel Audio Using the Multiband Source/Filter Model , 2006, 2006 Fortieth Asilomar Conference on Signals, Systems and Computers.

[31] P. Noll,et al. MPEG digital audio coding , 1997, IEEE Signal Process. Mag..

[32] J. D. Johnston,et al. Sum-difference stereo transform coding , 1992, Proceedings of ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[33] Biing-Hwang Juang,et al. Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[34] Davis Pan,et al. A Tutorial on MPEG/Audio Compression , 1995, IEEE Multim..

[35] Christof Faller,et al. Binaural cue coding-Part I: psychoacoustic fundamentals and design principles , 2003, IEEE Trans. Speech Audio Process..

[36] Xavier Serra,et al. A sound analysis/synthesis system based on a deterministic plus stochastic decomposition , 1990 .

[37] Henrique S. Malvar. Lapped transforms for efficient transform/subband coding , 1990, IEEE Trans. Acoust. Speech Signal Process..

[38] Louis Dunn Fielder,et al. ISO/IEC MPEG-2 Advanced Audio Coding , 1997 .

[39] Jesper Jensen,et al. A perceptual subspace approach for modeling of speech and audio signals with damped sinusoids , 2004, IEEE Transactions on Speech and Audio Processing.

[40] A. Spanias,et al. Perceptual coding of digital audio , 2000, Proceedings of the IEEE.

[41] Athanasios Mouchtaris,et al. Virtual Microphones for Multichannel Audio Resynthesis , 2003, EURASIP J. Adv. Signal Process..

[42] Michael M. Goodwin. Residual modeling in music analysis-synthesis , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.