Face Recognition Based on Videos by Using Convex Hulls

Set-based recognition approaches can effectively model a wide range of face appearance variations, but the computational complexity of current methods depends heavily on the set and class sizes. This paper introduces new video-based classification methods designed to reduce the disk space required for data samples and to speed up testing in large-scale face recognition systems. In the proposed method, image sets collected from videos are approximated with kernelized convex hulls, and we show that in this setting it is sufficient to use only the samples that shape the image set boundaries. The kernelized support vector data description (SVDD) is used to extract these boundary samples. Moreover, we show that these kernelized hypersphere models can also be used to approximate image sets for classification purposes. We then propose a binary hierarchical decision tree approach to further improve the speed of the classification system. Finally, we introduce a new video database that includes 285 people with 8 videos of each person, since the most popular video data sets used for set-based recognition methods include either only a few people or only a small number of videos per person. Experimental results on databases of varying sizes show that the proposed methods greatly improve the testing times of the classification system (we obtained speed-ups of up to a factor of 20) without a significant drop in accuracy.
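To make the idea of keeping only boundary samples concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hard-margin SVDD (no slack variables), a Gaussian kernel, and a simple Frank-Wolfe iteration on the SVDD dual in place of a full quadratic programming solver. The names `rbf_kernel` and `svdd_boundary_samples`, the `gamma` value, and the iteration count are all hypothetical choices made for illustration.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def svdd_boundary_samples(points, kernel=rbf_kernel, iterations=200):
    """Approximate a hard-margin kernel SVDD with Frank-Wolfe updates.

    The hypersphere centre is a convex combination of the samples in
    feature space. Returns (alpha, support): alpha holds the convex
    weights, support lists the indices with non-zero weight, i.e. the
    samples this sketch keeps as boundary samples.
    """
    n = len(points)
    # Precompute the kernel (Gram) matrix once.
    K = [[kernel(p, q) for q in points] for p in points]
    alpha = [0.0] * n
    alpha[0] = 1.0  # arbitrary start; fully replaced by the first update
    for t in range(iterations):
        # (K alpha)_i for every sample.
        Ka = [sum(alpha[j] * K[i][j] for j in range(n)) for i in range(n)]
        centre_norm = sum(alpha[i] * Ka[i] for i in range(n))
        # Squared feature-space distance of each sample to the centre.
        dist = [K[i][i] - 2.0 * Ka[i] + centre_norm for i in range(n)]
        far = max(range(n), key=dist.__getitem__)
        # Standard Frank-Wolfe step size; equals 1 at t = 0.
        step = 2.0 / (t + 2.0)
        alpha = [(1.0 - step) * a for a in alpha]
        alpha[far] += step
    support = [i for i, a in enumerate(alpha) if a > 0.0]
    return alpha, support
```

Only samples that were at some point the farthest from the current centre ever receive weight, so a sample lying well inside the set keeps a weight of exactly zero and can be discarded. For example, calling the function on the four corners of a square plus its centre point with a small `gamma` (say 0.1) leaves the centre point out of `support`. A soft-margin formulation, as is common for SVDD, would additionally bound each weight from above to tolerate outliers.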
