Capturing Spatial and Temporal Patterns for Facial Landmark Tracking through Adversarial Learning

The spatial and temporal patterns inherent in facial feature points are crucial for facial landmark tracking, but they have not yet been thoroughly explored. In this paper, we propose a novel deep adversarial framework that exploits shape and temporal dependencies at both the appearance level and the target label level. The framework consists of a deep landmark tracker and a discriminator. The tracker is composed of a stacked Hourglass network, a convolutional neural network, and a long short-term memory network, and thus implicitly captures spatial and temporal patterns from facial appearance. The discriminator is trained to distinguish tracked facial landmarks from ground-truth ones. It explicitly models the shape and temporal dependencies present in ground-truth facial landmarks through a second convolutional neural network and a second long short-term memory network. The tracker and the discriminator compete with each other. Through this adversarial learning, the proposed approach leverages inherent spatial and temporal patterns at both the appearance level and the target label level to improve facial landmark tracking. Experimental results on two benchmark databases demonstrate the superiority of the proposed approach over state-of-the-art work.
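The competition between tracker and discriminator can be illustrated with a minimal sketch. The abstract does not state the exact objective, so the code below assumes the standard binary cross-entropy adversarial loss combined with an L2 landmark error; the function names, the `adv_weight` parameter, and its default value are hypothetical and chosen only for illustration. The discriminator scores are taken as given probabilities rather than produced by a real network.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for a single probability, clipped for stability."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Assumed discriminator objective: score ground-truth landmark
    sequences as 1 and tracked landmark sequences as 0.

    d_real, d_fake: lists of discriminator output probabilities.
    """
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def tracker_loss(pred, truth, d_fake, adv_weight=0.1):
    """Assumed tracker objective: supervised L2 landmark error plus a
    weighted adversarial term that rewards fooling the discriminator.

    pred, truth: lists of (x, y) landmark coordinates.
    d_fake: discriminator probabilities for the tracked landmarks.
    adv_weight: hypothetical trade-off weight, not taken from the paper.
    """
    l2 = sum(
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty) in zip(pred, truth)
    ) / len(pred)
    adv = sum(bce(p, 1.0) for p in d_fake) / len(d_fake)
    return l2 + adv_weight * adv
```

Under these assumptions, the discriminator loss falls as it separates ground-truth from tracked landmarks, while the tracker loss falls both when predictions approach the ground truth and when the discriminator is more easily fooled. In a full system the two losses would be minimized in alternation.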
