The Impact of an Accurate Vertical Localization with HRTFs on Short Explorations of Immersive Virtual Reality Scenarios

Achieving a full 3D auditory experience with head-related transfer functions (HRTFs) is still one of the main challenges of spatial audio rendering. HRTFs capture the listener's acoustic effects and personal perception, allowing immersion in virtual reality (VR) applications. This paper investigates the connection between listener sensitivity to vertical localization cues and experienced presence, spatial audio quality, and attention. Two VR experiments with a head-mounted display (HMD) and an animated visual avatar are proposed: (i) a screening test that evaluates the participants' localization performance with HRTFs for a non-visible spatialized audio source, and (ii) a 2-minute free exploration of a VR scene with five audiovisual sources in both a non-spatialized (2D stereo panning) and a spatialized (free-field HRTF rendering) listening condition. The screening test allows a distinction between good and bad localizers. The second experiment shows that the different audio rendering methods introduce no biases in the quality of experience (QoE); more interestingly, good localizers perceive a lower audio latency and are less involved in the visual aspects.
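The two listening conditions can be contrasted with a minimal sketch. It is not the rendering pipeline used in the study: the function names, the constant-power pan law, and the toy impulse responses are assumptions made for illustration only. Stereo panning scales one signal by two gains that depend on azimuth alone, so it carries no elevation cue, whereas HRTF rendering filters the signal with a direction-dependent pair of head-related impulse responses (HRIRs).

```python
import math

def constant_power_pan(mono, azimuth_deg):
    """2D stereo panning (assumed constant-power law), azimuth in [-90, 90] degrees."""
    az = max(-90.0, min(90.0, azimuth_deg))
    theta = (az + 90.0) / 180.0 * (math.pi / 2.0)  # 0 = hard left, pi/2 = hard right
    g_left, g_right = math.cos(theta), math.sin(theta)
    return [g_left * x for x in mono], [g_right * x for x in mono]

def convolve(signal, kernel):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def hrtf_render(mono, hrir_left, hrir_right):
    """Binaural rendering: filter the source with the HRIR pair of one direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy HRIRs (placeholders, not measured data) for a single source direction.
left, right = hrtf_render([1.0, 0.0], [0.5, 0.25], [0.0, 0.5])
```

In a real renderer the HRIR pair would come from a measured or selected HRTF set and would be updated with head tracking; the sketch only shows why elevation information can exist in the spatialized condition and not in the panned one.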
