HRTF-aided broadband DOA estimation using two microphones

Two-sensor broadband direction of arrival (DOA) estimation suffers from an inherent lack of dimensionality, yet humans and other animals overcome this limitation by exploiting subtle variations introduced by the ears. Applying existing DOA estimation techniques to such systems is complicated by the ill-behaved nature of the Head Related Transfer Function (HRTF). In this paper we present a subband signal extraction and focussing technique that retains the diversity information of the HRTF. We then develop a framework for combining these signals for subspace DOA estimation and investigate the constraints imposed on the single- and multi-source DOA estimation problems. Finally, we compare estimation performance with existing techniques and find that it improves to a level comparable to human localisation abilities.
