Fusion of Multiple Camera Views for Kernel-Based 3D Tracking

We present a computer vision system that robustly tracks an object in 3D by combining evidence from multiple calibrated cameras. Its novelty lies in a unified approach to 3D kernel-based tracking, which fuses the appearance features from all available camera sensors, as opposed to tracking the object's appearance in the individual 2D views and then fusing the results. The elegance of the method resides in its inherent ability to handle problems encountered by various 2D trackers, including scale selection, occlusion, view dependence, and correspondence across different views. We apply the method to the CHIL project database to track the presenter's head during lectures inside smart rooms equipped with four calibrated cameras. Compared to traditional 2D mean-shift tracking approaches, the proposed algorithm yields a 35% relative reduction in overall 3D tracking error and a 70% reduction in the number of tracker re-initializations.
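The idea of fusing appearance evidence in 3D, rather than tracking per view and merging afterwards, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes grayscale images with intensities in [0, 1), 3x4 projection matrices, a fixed kernel radius in pixels, and a summed Bhattacharyya coefficient as the fused score. A brute-force search over neighbouring 3D positions stands in for the mean-shift update, and all function names (`project`, `patch_histogram`, `fused_similarity`, `track_step`) are hypothetical.

```python
import numpy as np


def project(P, X):
    """Project 3D point X (3,) with a 3x4 camera matrix P; None if behind the camera."""
    x = P @ np.append(X, 1.0)
    if x[2] <= 0:
        return None
    return x[:2] / x[2]


def patch_histogram(image, center, radius, bins=8):
    """Epanechnikov-weighted intensity histogram around `center` (u, v) in pixels."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = ((xs - center[0]) ** 2 + (ys - center[1]) ** 2) / radius ** 2
    weights = np.clip(1.0 - d2, 0.0, None)  # zero outside the kernel support
    idx = np.minimum((image * bins).astype(int), bins - 1)
    hist = np.bincount(idx.ravel(), weights=weights.ravel(), minlength=bins)
    total = hist.sum()
    return hist / total if total > 0 else hist


def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalized histograms."""
    return float(np.sum(np.sqrt(p * q)))


def fused_similarity(X, cameras, images, target_hists, radius_px):
    """Sum the per-view appearance similarity of one 3D hypothesis X.

    Views in which X projects outside the image contribute nothing, so the
    score is driven by whichever cameras currently see the hypothesis.
    """
    score = 0.0
    for P, img, q in zip(cameras, images, target_hists):
        uv = project(P, X)
        if uv is None:
            continue
        h, w = img.shape
        if not (0 <= uv[0] < w and 0 <= uv[1] < h):
            continue
        score += bhattacharyya(patch_histogram(img, uv, radius_px), q)
    return score


def track_step(X_prev, cameras, images, target_hists, radius_px, step=0.1):
    """Pick the best 3D position on a 3x3x3 grid around X_prev.

    Assumption: this exhaustive local search replaces the mean-shift
    iteration purely to keep the sketch short.
    """
    best_X, best_score = X_prev, -np.inf
    for dx in (-step, 0.0, step):
        for dy in (-step, 0.0, step):
            for dz in (-step, 0.0, step):
                X = X_prev + np.array([dx, dy, dz])
                s = fused_similarity(X, cameras, images, target_hists, radius_px)
                if s > best_score:
                    best_X, best_score = X, s
    return best_X, best_score
```

Because every hypothesis is a single 3D point scored against all views at once, cross-view correspondence never has to be solved explicitly in this sketch. A fuller version would likely derive the per-view kernel radius from the object's physical size and its depth in each camera instead of using a fixed pixel radius.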
