Face and Body Association for Video-Based Face Recognition

In recent years face recognition has made extraordinary leaps, yet unconstrained video-based face identification in the wild remains an open and interesting problem. Videos, unlike still-images, offer a myriad of data for face modeling, sampling, and recognition, but, on the other hand, contain low-quality frames and motion blur. A key component in video-based face recognition is the way in which faces are associated through the video sequence before being used for recognition. In this paper, we present a video-based face recognition method taking advantage of face and body association (FBA). To track and associate subjects that appear across frames in multiple shots, we solve a data association problem using both face and body appearance. The final recovered track is then used to build a face representation for recognition. We evaluate our FBA method for video-based face recognition on a challenging dataset. Our experiments show up to 5% improvement in the identification rate over the state-of-the-art.

[1]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Bin Yang,et al.  Aggregate channel features for multi-view face detection , 2014, IEEE International Joint Conference on Biometrics.

[3]  Rama Chellappa,et al.  A deep pyramid Deformable Part Model for face detection , 2015, 2015 IEEE 7th International Conference on Biometrics Theory, Applications and Systems (BTAS).

[4]  Junjie Yan,et al.  Face detection by structural models , 2014, Image Vis. Comput..

[5]  Andrew Zisserman,et al.  Automatic face recognition for film character retrieval in feature-length films , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[6]  Andrew Zisserman,et al.  Deep Face Recognition , 2015, BMVC.

[7]  Luc Van Gool,et al.  Online Multiperson Tracking-by-Detection from a Single, Uncalibrated Camera , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Alberto Del Bimbo,et al.  Object Tracking by Oversampling Local Features , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  周鑫,et al.  Tracking-learning-detection (TLD)-based video object tracking method , 2012 .

[10]  Xiaoou Tang,et al.  Joint Face Representation Adaptation and Clustering in Videos , 2016, ECCV.

[11]  Ieee Xplore,et al.  IEEE Transactions on Pattern Analysis and Machine Intelligence Information for Authors , 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Louis-Philippe Morency,et al.  Convolutional Experts Constrained Local Model for 3D Facial Landmark Detection , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[13]  Carlos D. Castillo,et al.  Video-Based Face Association and Identification , 2017, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017).

[14]  Shuicheng Yan,et al.  Toward Large-Population Face Identification in Unconstrained Videos , 2014, IEEE Transactions on Circuits and Systems for Video Technology.

[15]  Shengcai Liao,et al.  A Fast and Accurate Unconstrained Face Detector , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Shuo Yang,et al.  WIDER FACE: A Face Detection Benchmark , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Anil K. Jain,et al.  Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus Benchmark A , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Yuan Li,et al.  Tracking in Low Frame Rate Video: A Cascade Particle Filter with Discriminative Observers of Different Lifespans , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Ramakant Nevatia,et al.  Spatio-Temporal Action Detection with Cascade Proposal and Location Anticipation , 2017, BMVC.

[20]  Zhongli Wang,et al.  A survey on multiple object tracking algorithm , 2016, 2016 IEEE International Conference on Information and Automation (ICIA).

[21]  Mubarak Shah,et al.  Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Jianguo Li,et al.  Learning SURF Cascade for Fast and Accurate Object Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Shifeng Zhang,et al.  S^3FD: Single Shot Scale-Invariant Face Detector , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[24]  Ramakant Nevatia,et al.  Beyond Pedestrians: A Hybrid Approach of Tracking Multiple Articulating Humans , 2015, 2015 IEEE Winter Conference on Applications of Computer Vision.

[25]  Rainer Stiefelhagen,et al.  Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics , 2008, EURASIP J. Image Video Process..

[26]  Shihong Lao,et al.  Multi-object tracking through occlusions by local tracklets filtering and global tracklets association with detection responses , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Jiri Matas,et al.  Face-TLD: Tracking-Learning-Detection applied to faces , 2010, 2010 IEEE International Conference on Image Processing.

[28]  Alberto Del Bimbo,et al.  Person Re-Identification by Iterative Re-Weighted Sparse Ranking , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Zhen Qin,et al.  Improving multi-target tracking via social grouping , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Tal Hassner,et al.  Rapid Synthesis of Massive Face Sets for Improved Face Recognition , 2017, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017).

[32]  Deva Ramanan,et al.  Face detection, pose estimation, and landmark localization in the wild , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Tal Hassner,et al.  Effective Unconstrained Face Recognition by Combining Multiple Descriptors and Learned Background Statistics , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[34]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[35]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Alberto Del Bimbo,et al.  Matching People across Camera Views using Kernel Canonical Correlation Analysis , 2014, ICDSC.

[37]  Duy-Dinh Le,et al.  Scalable Face Track Retrieval in Video Archives Using Bag-of-Faces Sparse Representation , 2017, IEEE Transactions on Circuits and Systems for Video Technology.

[38]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[39]  Ramakant Nevatia,et al.  A multi-scale cascade fully convolutional network face detector , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[40]  Alberto Del Bimbo,et al.  Multichannel-Kernel Canonical Correlation Analysis for Cross-View Person Reidentification , 2016, ACM Trans. Multim. Comput. Commun. Appl..

[41]  Peiyun Hu,et al.  Finding Tiny Faces , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Matthew Q. Hill,et al.  Human and algorithm performance on the PaSC face Recognition Challenge , 2015, 2015 IEEE 7th International Conference on Biometrics Theory, Applications and Systems (BTAS).

[43]  Jian Sun,et al.  Joint Cascade Face Detection and Alignment , 2014, ECCV.

[44]  Rama Chellappa,et al.  Unconstrained face verification using deep CNN features , 2015, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).

[45]  Bolei Zhou,et al.  Learning Deep Features for Scene Recognition using Places Database , 2014, NIPS.

[46]  Ramakant Nevatia,et al.  Detection and Tracking of Multiple, Partially Occluded Humans by Bayesian Combination of Edgelet based Part Detectors , 2007, International Journal of Computer Vision.

[47]  Marwan Mattar,et al.  Labeled Faces in the Wild: A Database forStudying Face Recognition in Unconstrained Environments , 2008 .

[48]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[49]  Shuo Yang,et al.  From Facial Parts Responses to Face Detection: A Deep Learning Approach , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[50]  James J. Little,et al.  A Boosted Particle Filter: Multitarget Detection and Tracking , 2004, ECCV.

[51]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[52]  Bohyung Han,et al.  Learning Multi-domain Convolutional Neural Networks for Visual Tracking , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Erik Learned-Miller,et al.  FDDB: A benchmark for face detection in unconstrained settings , 2010 .

[54]  Tal Hassner,et al.  Do We Really Need to Collect Millions of Faces for Effective Face Recognition? , 2016, ECCV.

[55]  Bin Yang,et al.  Convolutional Channel Features , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[56]  Gérard G. Medioni,et al.  Pose-Aware Face Recognition in the Wild , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  Huaizu Jiang,et al.  Face Detection with the Faster R-CNN , 2016, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017).

[58]  Yaser Sheikh,et al.  OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[59]  Wenhan Luo,et al.  Multiple object tracking: A literature review , 2014, Artif. Intell..

[60]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).