Face tracking and recognition with visual constraints in real-world videos

We address the problem of tracking and recognizing faces in real-world, noisy videos. We track faces using a tracker that adaptively builds a target model reflecting the changes in appearance typical of a video setting. However, adaptive appearance trackers often suffer from drift, a gradual adaptation of the tracker to non-targets. To alleviate this problem, our tracker introduces visual constraints using a combination of generative and discriminative models in a particle filtering framework. The generative term constrains the particles to the space of generic face poses, while the discriminative term ensures rejection of poorly aligned targets. This leads to a tracker that is significantly more robust to abrupt appearance changes and occlusions, which is critical for the subsequent recognition phase. The identity of the tracked subject is established by fusing pose-discriminant and person-discriminant features over the duration of a video sequence. This leads to a robust video-based face recognizer with state-of-the-art recognition performance. We test the quality of tracking and face recognition on real-world noisy videos from YouTube as well as on the standard Honda/UCSD database. Our approach produces successful face tracking results on over 80% of all videos without video-specific or person-specific parameter tuning. The good tracking performance induces similarly high recognition rates: 100% on Honda/UCSD and over 70% on the YouTube set containing 35 celebrities in 1500 sequences.
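
One way to picture how the visual constraints enter the particle filter is as extra factors in each particle's weight. The sketch below is illustrative only and is not the paper's formulation: the Gaussian form of the likelihoods, the hard rejection threshold, and all names (`combine_weights`, `resample`, `adaptive_err`, `generative_err`, `discriminative_score`, `sigma_a`, `sigma_g`, `threshold`) are assumptions made for this example.

```python
import math
import random


def combine_weights(adaptive_err, generative_err, discriminative_score,
                    sigma_a=1.0, sigma_g=1.0, threshold=0.0):
    """Return normalized particle weights (hypothetical weighting scheme).

    adaptive_err[i]         -- distance of particle i to the adaptive target model
    generative_err[i]       -- distance of particle i to a generic face-pose model
    discriminative_score[i] -- alignment score; below `threshold` means rejected
    """
    weights = []
    for e_a, e_g, score in zip(adaptive_err, generative_err, discriminative_score):
        if score < threshold:
            # Discriminative constraint: discard poorly aligned candidates.
            weights.append(0.0)
            continue
        # Adaptive appearance term times generative constraint term
        # (Gaussian likelihoods are an assumption of this sketch).
        w_adaptive = math.exp(-e_a ** 2 / (2.0 * sigma_a ** 2))
        w_generative = math.exp(-e_g ** 2 / (2.0 * sigma_g ** 2))
        weights.append(w_adaptive * w_generative)

    total = sum(weights)
    if total == 0.0:
        # Every particle was rejected: fall back to uniform weights.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def resample(particles, weights, rng=random):
    """Draw a new particle set with probability proportional to weight."""
    return rng.choices(particles, weights=weights, k=len(particles))
```

Under these assumptions, a particle that matches the adaptive model well but lies far from the generic face-pose model receives a low weight, which is the mechanism by which a constraint of this kind would limit drift toward non-targets.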
