Towards Robust Perception and Model Integration

Many of today's vision algorithms are very successful in controlled environments. Real-world environments, however, cannot be controlled and are usually dynamic: illumination changes, objects move, occlusions occur, and several people may be present at once. Because most computer vision algorithms are designed for a particular situation, they lack robustness in such dynamically changing environments. In this paper we argue that integrating information from different visual cues and models is essential to increase both the robustness and the generality of computer vision algorithms. We discuss two examples in which cue and model integration improves the robustness of simple models. In the first example, mutual information is used to combine different object models for face detection without prior learning. The second example presents experimental results on multi-cue face tracking based on two principles: self-organization of the integration mechanism and self-adaptation of the cue models during tracking.
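To make the first example more concrete, the quantity it relies on can be sketched as follows. This is only an illustration of empirical mutual information between the discrete outputs of two models; the abstract does not specify how the paper computes or uses it, so the function name `mutual_information` and the binary response vectors below are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, of two equal-length discrete sequences."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of co-occurring output pairs
    px = Counter(xs)              # marginal counts for the first model
    py = Counter(ys)              # marginal counts for the second model
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical binary responses of two object models at the same eight image locations.
model_a = [0, 0, 1, 1, 0, 0, 1, 1]
model_b = [0, 1, 0, 1, 0, 1, 0, 1]

agreeing = mutual_information(model_a, model_a)   # the two models respond identically
unrelated = mutual_information(model_a, model_b)  # responses are statistically independent
```

High mutual information indicates that two models tend to respond to the same image locations, which is one plausible way agreement between models could be measured without a training stage.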
