Correspondence driven adaptation for human profile recognition

Visual recognition systems for video that rely on statistical learning models often perform worse once deployed in real-world environments, primarily because the training data can hardly cover the variations encountered in practice. To alleviate this issue, we propose using object correspondences in successive frames as weak supervision to adapt visual recognition models, a strategy that is particularly suitable for human profile recognition. Specifically, we instantiate this strategy in an advanced convolutional neural network (CNN) based system that estimates human gender, age, and race. Using incremental stochastic training, we force the system to output consistent and stable results on face images from the same trajectory in a video. Our baseline system already achieves competitive performance on gender and age estimation compared with state-of-the-art algorithms on the FG-NET database. Further, on two new video datasets containing about 900 persons, the proposed correspondence-based supervision improves estimation accuracy by a large margin over the baseline.
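
The idea of adapting a model so that it gives consistent outputs along a face trajectory can be illustrated with a toy example. The sketch below is not the paper's CNN or its actual loss, neither of which is specified in this abstract. It assumes a hypothetical linear estimator, a squared-difference consistency penalty between two frames of the same trajectory, and plain stochastic gradient steps. All names (`predict`, `consistency_step`, `adapt`, `spread`) and the feature values are illustrative assumptions.

```python
import random

def predict(w, x):
    """Linear estimate (e.g. an age score) for one face feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

def consistency_step(w, x_a, x_b, lr=0.05):
    """One SGD step on the assumed penalty 0.5 * (f(x_a) - f(x_b))**2.

    x_a and x_b are features of two frames from the same trajectory,
    so they show the same person and should receive the same estimate.
    """
    diff = predict(w, x_a) - predict(w, x_b)
    # Gradient of the penalty with respect to w is diff * (x_a - x_b).
    return [wi - lr * diff * (xa - xb) for wi, xa, xb in zip(w, x_a, x_b)]

def adapt(w, trajectory, steps=200, lr=0.05, seed=0):
    """Incrementally adapt w using randomly sampled frame pairs."""
    rng = random.Random(seed)
    for _ in range(steps):
        x_a, x_b = rng.sample(trajectory, 2)
        w = consistency_step(w, x_a, x_b, lr)
    return w

def spread(w, trajectory):
    """Range of the estimates over one trajectory (0 means fully consistent)."""
    preds = [predict(w, x) for x in trajectory]
    return max(preds) - min(preds)

# Hypothetical trajectory: a constant bias feature plus one nuisance
# feature (e.g. pose) that varies from frame to frame.
trajectory = [[1.0, 0.2], [1.0, -0.1], [1.0, 0.4]]
w_before = [30.0, 5.0]
w_after = adapt(w_before, trajectory)

print("spread before:", spread(w_before, trajectory))
print("spread after: ", spread(w_after, trajectory))
```

In this toy setting the penalty only shrinks the weight on the feature that varies within the trajectory, so the estimates become more stable while the bias term is left unchanged. A consistency penalty used on its own can collapse a model toward constant outputs, which is one reason it would normally be combined with the supervised baseline rather than replace it.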
