Particle filtering for bearing-only audio-visual speaker detection and tracking

We present a method for audio-visual speaker detection and tracking in a smart meeting room, based on bearing measurements and particle filtering. Bearing measurements are obtained from the Time Difference of Arrival (TDOA) of the acoustic signal at a pair of microphones, and from tracking facial regions in images from monocular cameras. A particle filter samples the space of possible speaker locations within the meeting room and fuses the bearing measurements from the auditory and visual sources. The proposed system was tested in a video messaging scenario with a single participant seated in front of a screen to which a camera and a microphone pair are attached. The experimental results show that the accuracy of bearing-based speaker tracking depends on the location of the speaker relative to the camera and microphones, an effect that can be quantified using a parameter known as Dilution of Precision.
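
The fusion idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a far-field source, a 2-D room, a random-walk motion model, Gaussian bearing noise and multinomial resampling. All names (`tdoa_to_bearing`, `pf_step`, the sensor positions, the noise levels) are hypothetical. The two sensors are placed 2 m apart only so that the bearings intersect well in this toy setup.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def tdoa_to_bearing(tdoa_s, mic_spacing_m):
    """Far-field angle (rad) from the array broadside for one microphone pair."""
    # Clamp so rounding or noise cannot push the asin argument outside [-1, 1].
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * tdoa_s / mic_spacing_m))
    return math.asin(s)


def wrap(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def bearing_from(sensor_xy, target_xy):
    """World-frame bearing (rad, from the +x axis) of a target seen from a sensor."""
    return math.atan2(target_xy[1] - sensor_xy[1], target_xy[0] - sensor_xy[0])


def pf_step(particles, measurements, motion_sigma, rng):
    """One predict / update / resample cycle.

    measurements: list of (sensor_xy, bearing_rad, sigma_rad) tuples.
    """
    # Predict: random-walk motion model (an assumption of this sketch).
    moved = [(x + rng.gauss(0.0, motion_sigma), y + rng.gauss(0.0, motion_sigma))
             for x, y in particles]
    # Update: multiply Gaussian bearing likelihoods, treating sensors as independent.
    weights = []
    for p in moved:
        w = 1.0
        for sensor_xy, bearing, sigma in measurements:
            err = wrap(bearing - bearing_from(sensor_xy, p))
            w *= math.exp(-0.5 * (err / sigma) ** 2)
        weights.append(w)
    if sum(weights) == 0.0:
        weights = [1.0] * len(moved)  # fall back to uniform weights
    # Resample: multinomial resampling proportional to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def estimate(particles):
    """Mean particle position as the point estimate."""
    n = len(particles)
    return (sum(x for x, _ in particles) / n, sum(y for _, y in particles) / n)


# Toy scenario with hypothetical geometry (metres).
rng = random.Random(0)
speaker = (1.0, 2.0)
mic_centre, camera = (-1.0, 0.0), (1.0, 0.0)
mic_spacing = 0.2  # microphones lie along the x axis, so broadside is +y

# Simulate a noise-free TDOA, then convert it back to a world-frame bearing.
theta_true = math.pi / 2 - bearing_from(mic_centre, speaker)
tdoa = mic_spacing * math.sin(theta_true) / SPEED_OF_SOUND
audio_bearing = math.pi / 2 - tdoa_to_bearing(tdoa, mic_spacing)
video_bearing = bearing_from(camera, speaker)

measurements = [(mic_centre, audio_bearing, 0.05), (camera, video_bearing, 0.05)]
particles = [(rng.uniform(-3.0, 3.0), rng.uniform(0.0, 4.0)) for _ in range(3000)]
for _ in range(10):
    particles = pf_step(particles, measurements, 0.05, rng)
est = estimate(particles)
print("estimated speaker position:", est)
```

Moving the two sensors closer together in this sketch makes the bearing lines nearly parallel and the range estimate correspondingly poorer, which is the geometric effect that Dilution of Precision is meant to capture.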
