Extracting the frequencies of the pinna spectral notches in measured head related impulse responses.

The head related impulse response (HRIR) characterizes the auditory cues created by the scattering of sound off a person's anatomy. The experimentally measured HRIR depends on several factors, such as reflections from body parts (torso, shoulders, and knees), head diffraction, and reflection/diffraction effects due to the pinna. Structural models (Algazi et al., 2002; Brown and Duda, 1998) seek to establish direct relationships between the features in the HRIR and the anatomy. While there is evidence that particular features in the HRIR can be explained by anthropometry, the creation of such models from experimental data is hampered by the fact that the extraction of the features in the HRIR is not automatic. One of the prominent features observed in the HRIR, and one that has been shown to be important for elevation perception, is the set of deep spectral notches attributed to the pinna. In this paper we propose a method to robustly extract the frequencies of the pinna spectral notches from the measured HRIR, distinguishing them from other confounding features. The method also extracts the resonances described by Shaw (1997). The techniques are applied to the publicly available CIPIC HRIR database (Algazi et al., 2001c). The extracted notch frequencies are related to the physical dimensions and shape of the pinna.
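
To make the notion of a spectral notch concrete, the following is a minimal sketch of a naive baseline, not the method proposed in the paper: it takes the magnitude spectrum of an impulse response and reports local minima that lie well below the spectral peak. The function names, the toy response (a direct impulse plus one delayed reflection), the 44.1 kHz sampling rate, the FFT length, and the 10 dB depth threshold are all illustrative assumptions.

```python
import cmath
import math


def magnitude_db(hrir, n_fft):
    """Magnitude spectrum in dB for bins 0..n_fft/2, computed with a direct DFT."""
    mags = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n, x in enumerate(hrir))
        # Floor the magnitude to avoid log10(0) at a perfect null.
        mags.append(20.0 * math.log10(max(abs(acc), 1e-12)))
    return mags


def naive_notch_frequencies(hrir, fs, n_fft=256, min_depth_db=10.0):
    """Return frequencies (Hz) of interior local minima that lie at least
    min_depth_db below the spectral peak. Hypothetical helper for illustration."""
    mags = magnitude_db(hrir, n_fft)
    peak = max(mags)
    notches = []
    for k in range(1, len(mags) - 1):
        is_local_min = mags[k] < mags[k - 1] and mags[k] < mags[k + 1]
        if is_local_min and peak - mags[k] >= min_depth_db:
            notches.append(k * fs / n_fft)
    return notches


# Toy response (assumed): direct sound plus a reflection delayed by 3 samples.
# Such a pair interferes destructively near fs / (2 * 3) = 7350 Hz.
fs = 44100
toy_hrir = [1.0, 0.0, 0.0, 0.8]
notches = naive_notch_frequencies(toy_hrir, fs)
print(notches)
```

On a measured HRIR, a minimum picker of this kind would also respond to minima caused by the torso, shoulders, knees, and head diffraction, which is the kind of confounding feature the proposed method is meant to separate from the pinna notches.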

[1] F. Asano, et al. Role of spectral cues in median plane localization, 1990, The Journal of the Acoustical Society of America.

[2] E. Shaw. Transformation of sound pressure level from the free field to the eardrum in the horizontal plane, 1974, The Journal of the Acoustical Society of America.

[3] R. Duda, et al. Range dependence of the response of a spherical head model, 1998.

[4] Paul M. Hofman, et al. Relearning sound localization with new ears, 1998, Nature Neuroscience.

[5] Abhijit Kulkarni, et al. Infinite-impulse-response models of the head-related transfer function, 1995, The Journal of the Acoustical Society of America.

[6] Yuvi Kahana, et al. Numerical Modelling of the Transfer Functions of a Dummy-Head and of the External Ear, 1999.

[7] A D Musicant, et al. The influence of pinnae-based spectral cues on sound localization, 1984, The Journal of the Acoustical Society of America.

[8] R. Thouless. Experimental Psychology, 1939, Nature.

[9] G. F. Kuhn. Model for the interaural time differences in the azimuthal plane, 1977.

[10] R Meddis, et al. A physical model of sound diffraction and reflections in the human concha, 1996, The Journal of the Acoustical Society of America.

[11] Gregory H. Wakefield, et al. Pole-zero approximations for head-related transfer functions using a logarithmic error criterion, 1997, IEEE Trans. Speech Audio Process.

[12] J. Hebrank, et al. Spectral cues used in the localization of sound sources on the median plane, 1974, The Journal of the Acoustical Society of America.

[13] B. Shinn-Cunningham, et al. Tori of confusion: binaural localization cues for sources within reach of a listener, 2000, The Journal of the Acoustical Society of America.

[14] J. C. Middlebrooks. Virtual localization improved by scaling nonindividualized external-ear transfer functions in frequency, 1999, The Journal of the Acoustical Society of America.

[15] E. Langendijk, et al. Contribution of spectral cues to human sound localization, 1999, The Journal of the Acoustical Society of America.

[16] J. Hebrank, et al. Pinna reflections as cues for localization, 1974, The Journal of the Acoustical Society of America.

[17] J. Makhoul, et al. Linear prediction: A tutorial review, 1975, Proceedings of the IEEE.

[18] V. Ralph Algazi, et al. The Use of Head-and-Torso Models for Improved Spatial Sound Synthesis, 2002.

[19] R. Duda, et al. Modeling the Contralateral HRTF, 1999.

[20] B. Yegnanarayana, et al. Significance of group delay functions in signal reconstruction from spectral magnitude or phase, 1984.

[21] Gregory H. Wakefield, et al. Efficient model fitting using a genetic algorithm: pole-zero approximations of HRTFs, 2002, IEEE Trans. Speech Audio Process.

[22] C L Patterson, et al. Design of ARMA Digital Filters by Pole-Zero Decomposition.

[23] J. Blauert. Spatial Hearing: The Psychophysics of Human Sound Localization, 1983.

[24] Larry S. Davis, et al. Virtual audio system customization using visual matching of ear parameters, 2002, Object recognition supported by user interaction for service robots.

[25] V. Ralph Algazi, et al. An adaptable ellipsoidal head model for the interaural time difference, 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[26] C. Avendano, et al. The CIPIC HRTF database, 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).

[27] Ramani Duraiswami, et al. Extracting Significant Features from the HRTF, 2003.

[28] D. M. Green, et al. Sound localization by human listeners, 1991, Annual review of psychology.

[29] E. Shaw, et al. Sound pressure generated in an external-ear replica and real human ears by a nearby point source, 1968, The Journal of the Acoustical Society of America.

[30] E D Young, et al. Neural organization and responses to complex stimuli in the dorsal cochlear nucleus, 1992, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[31] V R Algazi, et al. Elevation localization and head-related transfer function analysis at low frequencies, 2001, The Journal of the Acoustical Society of America.

[32] Simon R. Oldfield, et al. Detection and discrimination of spectral peaks and notches at 1 and 8 kHz, 1989, The Journal of the Acoustical Society of America.

[33] A. W. M. van den Enden, et al. Discrete Time Signal Processing, 1989.

[34] C. Avendano, et al. A head-and-torso model for low-frequency binaural elevation effects, 1999, Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452).

[35] B. Yegnanarayana, et al. Epoch extraction of voiced speech, 1975.

[36] Bill Gardner, et al. HRTF Measurements of a KEMAR Dummy-Head Microphone, 1994.

[37] L. Mcbride, et al. A technique for the identification of linear systems, 1965.

[38] Richard O. Duda, et al. A structural model for binaural sound synthesis, 1998, IEEE Trans. Speech Audio Process.

[39] B. Yegnanarayana. Formant extraction from linear-prediction phase spectra, 1978.

[40] Doris Kistler, et al. Of vulcan ears, human ears and 'earprints', 1998, Nature Neuroscience.

[41] L. Rayleigh, et al. XII. On our perception of sound direction, 1907.

[42] J. Brugge, et al. Virtual-space receptive fields of single auditory nerve fibers, 1993, Journal of neurophysiology.

[43] Nobuhiko Kitawaki, et al. Common-acoustical-pole and zero modeling of head-related transfer functions, 1999, IEEE Trans. Speech Audio Process.

[44] F L Wightman, et al. Localization using nonindividualized head-related transfer functions, 1993, The Journal of the Acoustical Society of America.

[45] B. Yegnanarayana. Speech analysis by pole-zero decomposition of short-time spectra, 1981.

[46] J. Hebrank, et al. Are two ears necessary for localization of sound sources on the median plane?, 1974, The Journal of the Acoustical Society of America.

[47] Richard O. Duda, et al. Structural composition and decomposition of HRTFs, 2001, Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575).

[48] William M. Hartmann, et al. How we localize sound, 1999.

[49] J. Brugge, et al. Sensitivity of auditory nerve fibers to spectral notches, 1993, Journal of neurophysiology.

[50] M. Gardner, et al. Problem of localization in the median plane: effect of pinnae cavity occlusion, 1973, The Journal of the Acoustical Society of America.

[51] Yuvi Kahana, et al. Spatial Acoustic Mode Shapes of the Human Pinna, 2000.

[52] R. Duda, et al. Approximating the head-related transfer function using simple geometric models of the head and torso, 2002, The Journal of the Acoustical Society of America.

[53] S. L. Marple, et al. A tutorial overview of modern spectral estimation, 1989, International Conference on Acoustics, Speech, and Signal Processing.

[54] Daniel J Tollin, et al. Spectral cues explain illusory elevation effects with stereo sounds in cats, 2003, Journal of neurophysiology.