Subspace selection to suppress confounding source domain information in AAM transfer learning

Active appearance models (AAMs) have seen tremendous success in face analysis. However, model learning depends on the availability of detailed annotations of canonical landmark points. As a result, when accurate AAM fitting is required on a different set of variations (expression, pose, identity), a new dataset must be collected and annotated. To overcome the need for time-consuming data collection and annotation, transfer learning approaches have received recent attention. The goal is to transfer knowledge from previously available datasets (source) to a new dataset (target). We propose a subspace transfer learning method that selects a subspace from the source that best describes the target space. We propose a metric for the directional similarity between the source eigenvectors and the target subspace, and show that it is equivalent to the variance of the target data when projected onto the source eigenvectors. Using this equivalence, we select the subset of source principal directions that captures the variance in the target data. To define our model, we augment the selected source subspace with a target subspace learned from a handful of target examples. In experiments on six public datasets, our approach outperforms the state of the art in terms of RMS fitting error as well as the percentage of test examples for which AAM fitting converges to the ground truth.
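The selection step described above can be sketched in a few lines of NumPy. This is a minimal illustration of the general idea, not the paper's exact formulation: function and parameter names (`transfer_subspace`, `k_src`, `k_tgt`) and the final re-orthonormalisation via QR are our assumptions.

```python
import numpy as np

def transfer_subspace(X_src, X_tgt, k_src=10, k_tgt=5):
    """Sketch: select source principal directions by the variance of the
    (few) target examples projected onto them, then augment with a
    subspace learned from the target data itself. Hyperparameters and
    the QR re-orthonormalisation are illustrative assumptions."""
    # Source eigenvectors via SVD of the centred source data
    # (rows of Vt_src are principal directions).
    Xs = X_src - X_src.mean(axis=0)
    _, _, Vt_src = np.linalg.svd(Xs, full_matrices=False)

    # Variance of the centred target data projected onto each
    # source direction -- the selection criterion from the abstract.
    Xt = X_tgt - X_tgt.mean(axis=0)
    proj_var = np.var(Xt @ Vt_src.T, axis=0)

    # Keep the k_src source directions capturing the most target variance.
    keep = np.argsort(proj_var)[::-1][:k_src]
    B_src = Vt_src[keep]

    # Target subspace from the handful of target examples.
    _, _, Vt_tgt = np.linalg.svd(Xt, full_matrices=False)
    B_tgt = Vt_tgt[:k_tgt]

    # Augment the selected source subspace with the target subspace
    # and re-orthonormalise the combined basis.
    B, _ = np.linalg.qr(np.vstack([B_src, B_tgt]).T)
    return B  # columns span the combined model subspace
```

In practice the combined basis would replace the appearance (or shape) basis of a standard AAM before fitting; the sketch only covers the subspace construction itself.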