ATF: Towards Robust Face Alignment via Leveraging Similarity and Diversity across Different Datasets

Face alignment is an important task in the field of multi-media. Together with the impressive progress of algorithms, various benchmark datasets have been released in recent years. Intuitively, it is meaningful to integrate multiple labeled datasets with different annotations to achieve higher performance on a target landmark detector. Although numerous efforts have been made in joint usage, there yet remain three shortages in recent works, e.g., additional computation, limitation of the markups scheme, and limited support for the regression method. To address the above problems, we proposed a novel Alternating Training Framework (ATF), which leverages similarity and diversity across multi-media sources for a more robust detector. Our framework mainly contains two sub-modules: Alternating Training with Decreasing Proportions (ATDP) and Mixed Branch Loss (mathcal LMB). In particular, ATDP trains multiple datasets simultaneously to take advantage of the diversity between them, while mathcal LMB utilizes similar landmark pairs to constrain different branches of corresponding datasets. Extensive experiments on various benchmarks show the effectiveness of our framework, and ATF is feasible for both heatmap-based network and direct coordinate regression. Specifically, the mean error even reaches 3.17 on the experiment on 300W leveraging WFLW, which significantly outperforms state-of-the-art methods. Both in an ordinary convolutional network (OCN) and HRNET, ATF achieves up to 9.96% relative improvement. Our source codes are made publicly available at https://github.com/starhiking/ATF.

[1]  Marek Kowalski,et al.  Deep Alignment Network: A Convolutional Neural Network for Robust Face Alignment , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[2]  Xiaoou Tang,et al.  Facial Landmark Detection by Deep Multi-task Learning , 2014, ECCV.

[3]  Sina Honari,et al.  Improving Landmark Localization with Semi-Supervised Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4]  Shuo Yang,et al.  WIDER FACE: A Face Detection Benchmark , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Hanjiang Lai,et al.  Robust Facial Landmark Detection via Recurrent Attentive-Refinement Networks , 2016, ECCV.

[6]  José Miguel Buenaposada,et al.  A Deeply-Initialized Coarse-to-fine Ensemble of Regression Trees for Face Alignment , 2018, ECCV.

[7]  Matthieu Cord,et al.  DeCaFA: Deep Convolutional Cascade for Face Alignment in the Wild , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[8]  Cheng Li,et al.  Unconstrained Face Alignment via Cascaded Compositional Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  SchmidhuberJürgen Deep learning in neural networks , 2015 .

[10]  Jiahuan Zhou,et al.  Learning Robust Facial Landmark Detection via Hierarchical Structured Ensemble , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[11]  David J. Kriegman,et al.  Localizing Parts of Faces Using a Consensus of Exemplars , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Deva Ramanan,et al.  Face detection, pose estimation, and landmark localization in the wild , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Lin Ma,et al.  PFLD: A Practical Facial Landmark Detector , 2019, ArXiv.

[14]  William J. Christmas,et al.  Dynamic Attention-Controlled Cascaded Shape Regression Exploiting Training Data Augmentation and Fuzzy-Set Sample Weighting , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Xi Zhou,et al.  Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network , 2018, ECCV.

[16]  Jiri Matas,et al.  XM2VTSDB: The Extended M2VTS Database , 1999 .

[17]  Li Zhang,et al.  Collaborative Facial Landmark Localization for Transferring Annotations Across Datasets , 2014, ECCV.

[18]  Fernando De la Torre,et al.  Supervised Descent Method and Its Applications to Face Alignment , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Yici Cai,et al.  Look at Boundary: A Boundary-Aware Face Alignment Algorithm , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Shiguang Shan,et al.  Leveraging Datasets with Varying Annotations for Face Alignment via Deep Regression Network , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[21]  Dong Liu,et al.  High-Resolution Representations for Labeling Pixels and Regions , 2019, ArXiv.

[22]  Yi Yang,et al.  Style Aggregated Network for Facial Landmark Detection , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[23]  Stefanos Zafeiriou,et al.  300 Faces in-the-Wild Challenge: The First Facial Landmark Localization Challenge , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[24]  Georgios Tzimiropoulos,et al.  Synergy between Face Alignment and Tracking via Discriminative Global Consensus Optimization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[25]  Rama Chellappa,et al.  Disentangling 3D Pose in a Dendritic CNN for Unconstrained 2D Face Alignment , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[26]  Sina Honari,et al.  Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Timothy F. Cootes,et al.  Active Shape Models-Their Training and Application , 1995, Comput. Vis. Image Underst..

[28]  Timothy F. Cootes,et al.  Active Appearance Models , 1998, ECCV.

[29]  Cheng Li,et al.  Face alignment by coarse-to-fine shape searching , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Yi Yang,et al.  Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[31]  Cheng Li,et al.  Transferring Landmark Annotations for Cross-Dataset Face Alignment , 2014, ArXiv.

[32]  Mingjie Zheng,et al.  Robust Facial Landmark Detection via Occlusion-Adaptive Deep Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Jürgen Schmidhuber,et al.  Deep learning in neural networks: An overview , 2014, Neural Networks.

[34]  Thomas S. Huang,et al.  Interactive Facial Feature Localization , 2012, ECCV.

[35]  Pietro Perona,et al.  Robust Face Landmark Estimation under Occlusion , 2013, 2013 IEEE International Conference on Computer Vision.

[36]  Wenyan Wu,et al.  Leveraging Intra and Inter-Dataset Variations for Robust Face Alignment , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[37]  Ning Zhang,et al.  Laplace Landmark Localization , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[38]  Jiaya Jia,et al.  Aggregation via Separation: Boosting Facial Landmark Detector With Semi-Supervised Style Translation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[39]  Dong Yu,et al.  Deep Learning: Methods and Applications , 2014, Found. Trends Signal Process..

[40]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[41]  Bhiksha Raj,et al.  SphereFace: Deep Hypersphere Embedding for Face Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Horst Bischof,et al.  Annotated Facial Landmarks in the Wild: A large-scale, real-world database for facial landmark localization , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[43]  David Cristinacce,et al.  Automatic feature localisation with constrained local models , 2008, Pattern Recognit..

[44]  Xiaogang Wang,et al.  Deep Convolutional Network Cascade for Facial Point Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[45]  Josef Kittler,et al.  Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.