Target Tracking Using a Joint Acoustic Video System

In this paper, a multitarget tracking system for collocated video and acoustic sensors is presented. We formulate the tracking problem using a particle filter based on a state-space approach. We first discuss the acoustic state-space formulation, whose observations use a sliding window of direction-of-arrival estimates. We then present the video state space, which tracks a target's position on the image plane based on online adaptive appearance models. For the joint operation of the filter, we combine the state vectors of the individual modalities and introduce a time-delay variable to handle the acoustic-video data synchronization issue caused by acoustic propagation delays. A novel particle filter proposal strategy for joint state-space tracking is introduced, which places the random support of the joint filter where the final posterior is likely to lie. Using the Kullback-Leibler divergence measure, we show that the joint operation of the filter decreases the worst-case divergence of the individual modalities. Owing to this proposal strategy, the resulting joint tracking filter is robust against video and acoustic occlusions. Computer simulations with synthetic and field data demonstrate the filter's performance.
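The synchronization issue mentioned above can be illustrated with a minimal sketch. It is not the paper's formulation: the function names, the nominal speed of sound, and the frame rate are assumptions chosen for illustration only. The idea is that sound received at time t left the target roughly r/c seconds earlier, so an acoustic observation corresponds to an earlier video frame.

```python
# Illustrative sketch only; names and constants are assumptions, not from the paper.
SPEED_OF_SOUND = 343.0  # m/s, assumed nominal value in air


def propagation_delay(range_m, c=SPEED_OF_SOUND):
    """Return the time (s) sound needs to travel range_m metres."""
    if range_m < 0:
        raise ValueError("range must be non-negative")
    return range_m / c


def matching_frame(t_acoustic, range_m, fps=30.0):
    """Index of the video frame taken when the received sound was emitted.

    t_acoustic is the reception time in seconds; frame 0 is at t = 0.
    The result is clamped to 0 when the emission predates the recording.
    """
    t_emit = t_acoustic - propagation_delay(range_m)
    return max(0, round(t_emit * fps))


if __name__ == "__main__":
    # A target 343 m away: sound heard at t = 2 s was emitted at about t = 1 s.
    print(propagation_delay(343.0))
    print(matching_frame(2.0, 343.0, fps=30.0))
```

In a tracker the range is not known in advance, which is one reason to carry the delay as a variable in the joint state rather than fix it beforehand.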
