Kalman Filters for Time Delay of Arrival-Based Source Localization

In this work, we propose an algorithm for acoustic source localization based on time delay of arrival (TDOA) estimation. In earlier work by other authors, a closed-form approximation was first used to estimate the position of the speaker, and a Kalman filtering stage then smoothed the resulting time series of estimates. The proposed algorithm eliminates this closed-form approximation by employing a Kalman filter to update the speaker's position estimate directly from the observed TDOAs. In particular, the TDOAs form the observation of an extended Kalman filter whose state is the speaker's position. We tested our algorithm on a data set consisting of seminars held by actual speakers. Our experiments showed that the proposed algorithm localizes the source more accurately than the standard spherical and linear intersection techniques. Moreover, although it relies on an iterative optimization scheme, the proposed algorithm proved efficient enough for real-time operation.
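To make the idea concrete, the following is a minimal sketch of one extended Kalman filter step whose state is the source position and whose observation is a vector of TDOAs. It is an illustration under stated assumptions, not the authors' implementation: the random-walk state model, the microphone layout, the choice of microphone pairs, the noise levels, and all function names are hypothetical, and it uses a plain single-pass EKF update rather than the iterative scheme mentioned in the abstract.

```python
import numpy as np
from itertools import combinations

C = 343.0  # assumed speed of sound in m/s


def tdoa_model(x, mics, pairs):
    """Predicted TDOA (seconds) of source position x for each microphone pair."""
    d = np.linalg.norm(x - mics, axis=1)  # distance from x to every microphone
    return np.array([(d[i] - d[j]) / C for i, j in pairs])


def tdoa_jacobian(x, mics, pairs):
    """Derivative of each predicted TDOA with respect to the position x."""
    diff = x - mics
    u = diff / np.linalg.norm(diff, axis=1, keepdims=True)  # unit vectors mic -> x
    return np.array([(u[i] - u[j]) / C for i, j in pairs])


def ekf_step(x, P, y, mics, pairs, Q, R):
    """One predict/update cycle; the state is the position itself."""
    # Predict: assumed random-walk model, so the position carries over
    # and only the uncertainty grows.
    P_pred = P + Q
    # Update: linearize the TDOA model around the predicted position.
    H = tdoa_jacobian(x, mics, pairs)
    innovation = y - tdoa_model(x, mics, pairs)
    S = H @ P_pred @ H.T + R
    K = np.linalg.solve(S, H @ P_pred).T  # Kalman gain P H^T S^-1 (S, P symmetric)
    x_new = x + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new


def demo():
    """Track a fixed source from noiseless synthetic TDOAs (hypothetical setup)."""
    # Eight microphones at the corners of an assumed 6 m x 5 m x 3 m room.
    mics = np.array([[mx, my, mz]
                     for mx in (0.0, 6.0)
                     for my in (0.0, 5.0)
                     for mz in (0.0, 3.0)])
    pairs = list(combinations(range(len(mics)), 2))
    true_pos = np.array([2.0, 3.5, 1.7])
    y = tdoa_model(true_pos, mics, pairs)  # synthetic, noise-free observation

    x = np.array([3.0, 2.5, 1.5])          # initial guess: room center
    P = np.eye(3)                          # initial position covariance (m^2)
    Q = 0.01 * np.eye(3)                   # assumed process noise
    R = (5e-5) ** 2 * np.eye(len(pairs))   # assumed TDOA noise (seconds^2)
    for _ in range(20):
        x, P = ekf_step(x, P, y, mics, pairs, Q, R)
    return x, true_pos


if __name__ == "__main__":
    estimate, truth = demo()
    print("estimate:", estimate, "truth:", truth)
```

Because the filter consumes the TDOAs directly, no intermediate closed-form position estimate is needed. Treating the pairwise TDOA errors as independent (a diagonal `R`) is a simplification, since pairs that share a microphone would generally have correlated errors.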
