Towards Robust Multi-cue Integration for Visual Tracking

Although many of today's vision algorithms are very successful, they lack robustness because each is typically limited to a particular situation. In this paper we argue that the principles of sensor and model integration can substantially increase the robustness of today's computer vision systems. As an example, we discuss multi-cue tracking of faces. The approach is based on two principles: self-organization of the integration mechanism and self-adaptation of the cue models during tracking. Experiments show that sensor and model integration significantly improves the robustness of simple models.
