Towards reliable real-time multiview tracking

We address the problem of reliable real-time 3D tracking of multiple objects observed in multiple wide-baseline camera views. Establishing spatio-temporal correspondence is a problem whose complexity is combinatorial in the number of objects and views. In addition, vision-based tracking suffers from ambiguities introduced by occlusion, clutter and irregular 3D motion. We present a discrete relaxation algorithm that reduces the intrinsic combinatorial complexity by pruning the decision tree using unreliable prior information from independent 2D tracking in each view. The algorithm improves the reliability of spatio-temporal correspondence by optimising simultaneously over multiple views when 2D tracking in one or more views is ambiguous. Application to the 3D reconstruction of human movement, based on tracking skin-coloured regions in three views, demonstrates considerable improvement in reliability and performance. The results show that optimisation over multiple views gives correct 3D reconstruction and object labelling in the presence of incorrect 2D tracking whilst maintaining real-time performance.
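
To make the combinatorial argument concrete, the following is a minimal sketch and not the discrete relaxation algorithm itself. It assumes a single frame in which every view detects all N objects, so that fixing the labels in the first view leaves one permutation per remaining view, i.e. (N!)^(V-1) joint labellings. The function names and the `allowed` dictionary, which stands in for the prior from independent 2D tracking, are hypothetical and chosen only for illustration.

```python
from itertools import permutations, product


def view_permutations(n_objects, view, allowed):
    """Return the label permutations for one view that the 2D prior permits.

    allowed maps (view, detection) to the set of object labels the
    (hypothetical) 2D tracker considers plausible for that detection.
    A missing key means the detection is unconstrained.
    """
    labels = range(n_objects)
    keep = []
    for perm in permutations(labels):
        if all(perm[d] in allowed.get((view, d), set(labels)) for d in labels):
            keep.append(perm)
    return keep


def joint_labellings(n_objects, n_views, allowed=None):
    """Enumerate joint labellings across views, with view 0 as the reference."""
    allowed = allowed or {}
    per_view = [view_permutations(n_objects, v, allowed)
                for v in range(1, n_views)]
    return list(product(*per_view))


# Three objects in three views, no prior: (3!)^2 = 36 hypotheses.
full = joint_labellings(3, 3)

# Assumed prior: view 1 is confident about one detection only,
# view 2 is confident about two detections.
prior = {(1, 0): {0}, (2, 0): {0}, (2, 1): {1}}
pruned = joint_labellings(3, 3, prior)

print(len(full), len(pruned))  # 36 2
```

In this toy setting the prior removes most hypotheses before any multi-view cost is evaluated; the remaining candidates are the ones a joint optimisation over views would still have to resolve.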
