Improving elevation perception with a tool for image-guided head-related transfer function selection

This paper proposes an image-guided HRTF selection procedure that exploits the relation between features of the pinna shape and HRTF notches. Using a 2D image of a subject’s pinna, the procedure selects from a database the HRTF set that best fits the anthropometry of that subject. The procedure is designed to be quick to apply and easy to use for someone with no prior knowledge of binaural audio technologies. The entire process is evaluated by means of an auditory model for sound localization in the mid-sagittal plane, taken from previous literature. Using virtual subjects from an HRTF database, a virtual experiment is implemented to assess the vertical localization performance of the database subjects when they are provided with HRTF sets selected by the proposed procedure. Results show a statistically significant improvement in predicted localization performance for the selected HRTFs compared to the KEMAR HRTFs, which are a commercial standard in many binaural audio solutions; moreover, the analysis provides useful indications for refining the perceptually motivated metrics that guide the selection.
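The selection step can be illustrated with a minimal sketch. The abstract does not give the exact mapping from pinna features to notches or the exact metric, so the following are assumptions made purely for illustration: pinna contours traced on the image are reduced to distances from the ear canal entrance, each distance d is mapped to a notch frequency with a simple reflection model f = c / (2d), and the mismatch is the mean relative deviation between predicted and measured notch frequencies. The function names and the toy database are hypothetical.

```python
# Illustrative sketch only: the reflection model, the metric, and all names
# below are assumptions, not the procedure specified in the paper.

SPEED_OF_SOUND = 343.0  # m/s, assumed value at room temperature


def predicted_notches(distances_m):
    """Map contour-to-ear-canal distances (metres) to notch frequencies (Hz).

    Assumes a single reflection with sign inversion, i.e. f = c / (2 d).
    """
    return [SPEED_OF_SOUND / (2.0 * d) for d in distances_m]


def mismatch(predicted_hz, measured_hz):
    """Mean relative deviation between predicted and measured notches."""
    if len(predicted_hz) != len(measured_hz):
        raise ValueError("notch tracks must have the same length")
    deviations = [abs(p - m) / m for p, m in zip(predicted_hz, measured_hz)]
    return sum(deviations) / len(deviations)


def select_hrtf(distances_m, database):
    """Return the database entry with the smallest mismatch, plus all scores.

    `database` maps a subject id to that subject's measured notch
    frequencies (Hz), sampled at the same elevations as `distances_m`.
    """
    predicted = predicted_notches(distances_m)
    scores = {sid: mismatch(predicted, notches)
              for sid, notches in database.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Toy data: two elevations, two hypothetical database subjects.
    toy_db = {"subject_a": [8575.0, 6860.0], "subject_b": [10000.0, 9000.0]}
    best, scores = select_hrtf([0.020, 0.025], toy_db)
    print(best, scores)
```

In this toy setup the listener's contour distances are compared against every database subject and the lowest-scoring HRTF set is returned; a generic set such as KEMAR would simply be one more entry scored the same way.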
