LUVLi Face Alignment: Estimating Landmarks’ Location, Uncertainty, and Visibility Likelihood

Modern face alignment methods have become quite accurate at predicting the locations of facial landmarks, but they typically neither estimate the uncertainty of their predicted locations nor predict whether landmarks are visible. In this paper, we present a novel framework for jointly predicting landmark locations, the associated uncertainties of these predicted locations, and landmark visibilities. We model these as mixed random variables and estimate them using a deep network trained with our proposed Location, Uncertainty, and Visibility Likelihood (LUVLi) loss. In addition, we release an entirely new labeling of a large face alignment dataset with over 19,000 face images in a full range of head poses. Each face is manually labeled with the ground-truth locations of 68 landmarks, with the additional information of whether each landmark is visible, self-occluded (due to extreme head poses), or externally occluded. Not only does our joint estimation yield accurate estimates of the uncertainty of predicted landmark locations, but it also yields state-of-the-art estimates for the landmark locations themselves on multiple standard face alignment datasets. Our method's estimates of the uncertainty of predicted landmark locations could be used to automatically identify input images on which face alignment fails, which can be critical for downstream tasks.
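
To make the idea of a joint likelihood over location, uncertainty, and visibility concrete, the following is a minimal sketch for a single landmark. It is not the LUVLi loss as defined in the paper: it assumes a Bernoulli model for visibility and a 2D Gaussian for location that is only scored when the landmark is visible. The function name `joint_nll` and all of its parameters are hypothetical.

```python
import numpy as np

def joint_nll(mu, sigma, p_vis, target, visible, eps=1e-7):
    """Illustrative joint negative log-likelihood for one landmark.

    Assumed (hypothetical) model, not taken from the paper:
      - visibility ~ Bernoulli(p_vis)
      - location  ~ Gaussian(mu, sigma), scored only if the landmark is visible
    mu, target: length-2 sequences; sigma: 2x2 symmetric positive-definite
    covariance; p_vis: predicted visibility probability; visible: 0 or 1.
    """
    mu = np.asarray(mu, dtype=float)
    target = np.asarray(target, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    # Binary cross-entropy for the visibility prediction.
    p = np.clip(p_vis, eps, 1.0 - eps)
    vis_term = -(visible * np.log(p) + (1 - visible) * np.log(1.0 - p))
    if not visible:
        # No location supervision for landmarks labeled as not visible.
        return float(vis_term)

    # Gaussian negative log-likelihood, dropping the constant log(2*pi).
    diff = target - mu
    _, logdet = np.linalg.slogdet(sigma)
    maha = diff @ np.linalg.solve(sigma, diff)
    loc_term = 0.5 * logdet + 0.5 * maha
    return float(vis_term + loc_term)
```

Under these assumptions, a large predicted covariance lowers the penalty for a large location error but pays for it through the log-determinant term, which is what lets a learned uncertainty flag images on which alignment is likely to have failed.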
