A dictionary learning approach to tracking

Tracking people with multiple cameras is of considerable current interest as a means of providing cues for audio-visual blind source separation in dynamic environments. Here we investigate combining a state-of-the-art object recognition technique with particle filters, one of the most popular methods of modelling object motion, for tracking people. The dictionary learning, or Bag-of-Words, approach to object recognition has proved very effective in recent years, as shown in large comparisons such as the PASCAL Visual Object Classes (VOC) Challenge. In this paper we use this proven object recognition method within the framework of a particle filter. This provides more accurate and robust tracking of people in a multiple-camera environment. We also demonstrate that the dictionary learning approach provides a principled method for fusing multiple features.
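As a rough illustration of how a Bag-of-Words appearance model might sit inside a particle filter, the following minimal Python sketch quantises local descriptors against a codebook, scores each particle by comparing its histogram with a reference histogram, and resamples. The function names, the `observe` callback, the Bhattacharyya-based likelihood, the random-walk motion model, and the parameters `sigma` and `motion_std` are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def bow_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest codeword (hard assignment,
    an assumption) and return a normalised word-count histogram."""
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)


def likelihood(hist, reference, sigma=0.2):
    """Score a candidate histogram against the reference model.
    Uses a Bhattacharyya-based distance (an assumed choice of similarity)."""
    bc = np.sqrt(hist * reference).sum()   # Bhattacharyya coefficient
    dist2 = max(1.0 - bc, 0.0)             # clamp tiny negative rounding errors
    return np.exp(-dist2 / (2.0 * sigma ** 2))


def particle_filter_step(particles, weights, observe, reference, codebook,
                         rng, motion_std=2.0):
    """One resample / predict / update cycle.

    `observe(state)` is a hypothetical callback that returns the local
    descriptors extracted from the image region described by `state`.
    """
    n = len(particles)
    # Resample particles in proportion to their current weights.
    idx = rng.choice(n, size=n, p=weights)
    # Predict with a simple random-walk motion model (an assumption).
    moved = particles[idx] + rng.normal(0.0, motion_std, size=particles.shape)
    # Update: weight each particle by its Bag-of-Words likelihood.
    new_w = np.array([
        likelihood(bow_histogram(observe(p), codebook), reference)
        for p in moved
    ])
    return moved, new_w / new_w.sum()
```

Under these assumptions, features from several descriptor types could be combined by concatenating their histograms before scoring, although the fusion scheme used in the paper may differ.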
