A Minimal Personalization of Dynamic Binaural Synthesis with Mixed Structural Modeling and Scattering Delay Networks

This paper provides a small set of essential parameters for personalized and effective real-time auralization over headphones. An image-guided procedure based on two 2D images of the user’s head drives the mixed structural modeling of the head-related transfer function (HRTF), combining a spherical head model with ear displacement and an HRTF high-frequency magnitude selected from a database according to ear anthropometry. Room acoustics phenomena are simplified following the scattering delay network (SDN) approach, which allows an accurate spatialization of first-order reflections. Finally, statistically significant improvements in localization performance in a virtual reality (VR) test point to some benefits of the proposed customized auralization model over the widely used higher-order ambisonics (HOA) rendering with generic HRTFs.
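To make the spherical head component more concrete, the sketch below computes the interaural time difference (ITD) of a rigid sphere using the classical Woodworth far-field formula. This is an illustration only and is not taken from the paper: it places the ears at ±90° on the sphere, so it does not include the ear displacement mentioned in the abstract. The function name, the default head radius of 8.75 cm, and the speed of sound of 343 m/s are all assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C


def woodworth_itd(azimuth_rad, head_radius_m=0.0875):
    """Far-field ITD (seconds) of a rigid spherical head, horizontal plane.

    Hypothetical helper: ears sit at +/-90 degrees (no ear displacement).
    azimuth_rad is measured from straight ahead, in [-pi, pi]; the sign of
    the result follows the sign of the azimuth.
    """
    theta = abs(azimuth_rad)
    if theta > math.pi:
        raise ValueError("azimuth must lie in [-pi, pi]")
    # With symmetric ears the sphere is front-back symmetric, so a source
    # behind the listener is folded onto its mirror image in front.
    if theta > math.pi / 2:
        theta = math.pi - theta
    # Woodworth formula: ITD = (a / c) * (theta + sin(theta)).
    itd = head_radius_m / SPEED_OF_SOUND * (theta + math.sin(theta))
    return math.copysign(itd, azimuth_rad)


if __name__ == "__main__":
    for deg in (0, 30, 60, 90):
        itd_us = woodworth_itd(math.radians(deg)) * 1e6
        print(f"azimuth {deg:3d} deg -> ITD {itd_us:6.1f} microseconds")
```

In a personalized setting, the head radius would be the quantity estimated from the user's images instead of a fixed default; how the paper maps its image measurements to model parameters is not specified in this abstract.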
