Mixed Source Sound Field Translation for Virtual Binaural Application With Perceptual Validation

Non-interactive, linear experiences such as cinema film offer high-quality surround sound to enhance immersion; however, the listening perspective is usually fixed to the recording microphone position. With the rise of virtual reality, there is a demand for recording and recreating real-world experiences that allow users to move throughout the reproduction. Sound field translation achieves this by building an equivalent environment of virtual sources to recreate the recording spatially. However, the technique still restricts the maximum distance a user can translate away from the recording microphone's perspective, because the discrete sampling of commercial higher-order microphones can capture only an acoustic sweet spot. In this paper, we propose a method for binaurally reproducing a microphone recording in a virtual application that allows the user to translate freely beyond the recording position. The method incorporates a mixture of near-field and far-field sources in a sparsely expanded virtual environment to maintain a perceptually accurate reproduction. We perceptually validate the method through a Multiple Stimulus with Hidden Reference and Anchor (MUSHRA) experiment. Compared to the plane-wave benchmark, the proposed method offers both improved source localizability and robustness to spectral distortions at translated listening positions. A cross-examination with numerical simulations demonstrates that the sparse expansion relaxes the inherent sweet-spot constraint, leading to improved localizability in sparse environments. Additionally, the proposed method better reproduces the intensity and binaural room impulse response spectra of near-field environments, further supporting the perceptual results.
