Individualized Head-Related Transfer Functions: Efficient Modeling and Estimation from Small Sets of Spatial Samples

This dissertation develops and evaluates a novel way of modeling and estimating individualized head-related transfer functions (HRTFs) from a priori information and a limited number of acoustic measurements. HRTFs describe the acoustic transformations caused by interactions of sound with a listener’s head, shoulders, and outer ears, which give a sound its directional characteristics. Once this transformation is measured, it can be applied to any non-directional sound to give a listener the perceptual illusion that the sound originated from a designated location in space. The ability to manipulate these spatial auditory cues has many applications for both immersive and informational spatial auditory displays (SADs). Unfortunately, high-fidelity SADs require individualized HRTFs that are measured for the specific end user of the display. Such measurements are typically impractical for widespread commercialization, in large part because traditional HRTF measurement techniques require several hundred spatial locations to be measured around a listener, a procedure that requires expensive and complicated equipment and can take upwards of two hours. This dissertation introduces a novel way to represent a spatially continuous HRTF with a relatively small number of parameters via a spherical harmonic decomposition. This representation is shown to be perceptually equivalent to a full HRTF in terms of the resulting localization accuracy, while providing a convenient form for making HRTF comparisons across individuals and spatial locations.
With this new framework, it is shown that HRTFs from a large group of individuals can be modeled effectively with a multivariate normal distribution, and that this underlying distribution can be used to provide a priori information for individualized HRTF estimation, allowing accurate estimates from as few as 12 spatially distributed locations. The new representation is also shown to be particularly well suited for the separation of individual and non-individual components of the HRTF. Analysis of the underlying distributions characterizing the HRTF, as well as perceptual evidence, indicates that only the spatial variation in an HRTF that corresponds to the vertical and front-back dimensions of a sound source location needs to be individualized. This finding forms the basis of what is referred to as the sectoral HRTF model. The sectoral HRTF model is shown to be valid in terms of localization accuracy and provides a way to capture the individual components in an HRTF with only nine parameters at a single frequency. The relationship between these parameters and the spatial variance along the vertical and front-back dimensions is also exploited to allow relatively accurate HRTF estimation from acoustic measurements that are limited to a single plane. Together, these developments provide significant advancements in the ease with which individualized HRTFs are collected, as well as a wealth of new information relevant to efficient HRTF modeling.
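
The estimation idea described above can be illustrated with a small numerical sketch. It is not the dissertation's implementation: it assumes a single frequency, a real spherical harmonic basis truncated at order 2 (nine coefficients, chosen only to keep the example short), additive Gaussian measurement noise, and a synthetic prior in place of one learned from an HRTF database. The function names `real_sh_order2`, `fibonacci_directions`, and `map_coefficients` are hypothetical. Under those assumptions, the posterior mean of the coefficients combines the multivariate normal prior with 12 measured directions.

```python
import numpy as np


def real_sh_order2(azimuth, elevation):
    """Real spherical harmonics up to order 2 (9 columns); angles in radians."""
    az = np.asarray(azimuth, dtype=float)
    el = np.asarray(elevation, dtype=float)
    # Unit direction vector for each (azimuth, elevation) pair.
    x = np.cos(el) * np.cos(az)
    y = np.cos(el) * np.sin(az)
    z = np.sin(el)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    c2 = 0.5 * np.sqrt(15.0 / np.pi)
    return np.stack([
        np.full_like(x, 0.5 * np.sqrt(1.0 / np.pi)),        # order 0
        c1 * y, c1 * z, c1 * x,                             # order 1
        c2 * x * y, c2 * y * z,                             # order 2
        0.25 * np.sqrt(5.0 / np.pi) * (3.0 * z**2 - 1.0),
        c2 * x * z,
        0.5 * c2 * (x**2 - y**2),
    ], axis=-1)


def fibonacci_directions(n):
    """Roughly uniform directions on the sphere (an assumed sampling layout)."""
    i = np.arange(n)
    z = 1.0 - (2.0 * i + 1.0) / n
    az = (i * np.pi * (3.0 - np.sqrt(5.0))) % (2.0 * np.pi)
    return az, np.arcsin(z)


def map_coefficients(Y, h, prior_mean, prior_cov, noise_var):
    """Posterior mean of c for h = Y c + noise, with c ~ N(prior_mean, prior_cov)."""
    prior_prec = np.linalg.inv(prior_cov)
    A = Y.T @ Y / noise_var + prior_prec
    b = Y.T @ h / noise_var + prior_prec @ prior_mean
    return np.linalg.solve(A, b)


# Synthetic demonstration: all numbers below are made up for illustration.
rng = np.random.default_rng(0)
prior_mean = rng.normal(size=9)                        # stand-in for a database mean
prior_cov = np.diag(rng.uniform(0.1, 1.0, size=9))     # stand-in for a database covariance
c_true = rng.multivariate_normal(prior_mean, prior_cov)

az, el = fibonacci_directions(12)                      # 12 measured directions
Y = real_sh_order2(az, el)                             # 12 x 9 basis matrix
h = Y @ c_true + rng.normal(scale=0.05, size=12)       # noisy single-frequency samples

c_hat = map_coefficients(Y, h, prior_mean, prior_cov, noise_var=0.05**2)
print("coefficient RMS error:", np.sqrt(np.mean((c_hat - c_true) ** 2)))
```

Because the estimate is computed in coefficient space, the same code applies when there are fewer measurements than coefficients; in that case the prior supplies the information the measurements cannot.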
