Perceptual Reproduction of Spatial Sound Using Loudspeaker-Signal-Domain Parametrization

Adaptive perceptual spatial sound reproduction techniques that employ a parametric model of the sound field can reproduce spatial sound with high perceptual accuracy compared to linear techniques. However, using a sound-field model to control the reproduced sound may compromise the perceived quality of individual channels when the model does not match the sound field. An alternative parametrization is proposed that directly estimates the perceptually relevant parameters of the target loudspeaker signals, without modeling the sound field. At the synthesis stage, loudspeaker signals with the target parametric properties are generated from the microphone signals using regularized least-squares mixing and decorrelation. Listening experiments show that the proposed method provides, on average, the overall perceived spatial sound reproduction quality of a state-of-the-art parametric spatial sound reproduction technique, while resolving earlier shortcomings related to the perceived quality of the individual channels.
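
As a rough illustration of the regularized least-squares mixing step named in the abstract, the following sketch computes a Tikhonov-regularized mixing matrix that maps microphone signals to a set of reference loudspeaker signals within one frequency band. This is a generic ridge-regression formulation and not necessarily the formulation used in the paper; the function name `regularized_ls_mixing`, the regularization weight `lam`, and the availability of explicit reference signals `Y` are assumptions made for the example. The decorrelation stage is not shown.

```python
import numpy as np

def regularized_ls_mixing(X, Y, lam=1e-2):
    """Return M minimizing ||Y - M X||_F^2 + lam * ||M||_F^2.

    X: (n_mics, n_frames) complex microphone signals in one band (assumed layout).
    Y: (n_speakers, n_frames) reference loudspeaker signals (hypothetical input).
    lam: Tikhonov regularization weight (hypothetical default).
    """
    n_mics = X.shape[0]
    Cxx = X @ X.conj().T              # microphone covariance (unnormalized)
    Cyx = Y @ X.conj().T              # cross-covariance, targets vs. microphones
    A = Cxx + lam * np.eye(n_mics)    # regularized, Hermitian positive definite
    # Solve M A = Cyx via the equivalent system A^H M^H = Cyx^H.
    return np.linalg.solve(A.conj().T, Cyx.conj().T).conj().T

# Usage: 4 microphones, 5 loudspeakers, 200 time frames of synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 200)) + 1j * rng.standard_normal((4, 200))
Y = rng.standard_normal((5, 4)) @ X
M = regularized_ls_mixing(X, Y, lam=1e-2)
loudspeaker_signals = M @ X           # shape (5, 200)
```

Larger values of `lam` shrink the mixing gains, trading reproduction accuracy for robustness against ill-conditioned microphone covariance; in a covariance-based scheme, any energy the mixing cannot supply would then be filled by decorrelated signals.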
