Video face matching using subset selection and clustering of probabilistic Multi-Region Histograms

Balancing computational efficiency with recognition accuracy is one of the major challenges in real-world video-based face recognition. A significant design decision for any such system is whether to process and use all faces detected over the video frames, or to select only a few ‘best’ faces. This paper presents a video face recognition system based on probabilistic Multi-Region Histograms and uses it to characterise the performance trade-offs in: (i) selecting a subset of faces compared to using all faces, and (ii) combining information from all faces via clustering. Three face selection metrics are evaluated for choosing a subset: face detection confidence, random subset, and sequential selection. Experiments on the recently introduced MOBIO dataset indicate that using all faces through clustering always outperformed selecting only a subset of faces. The experiments also show that the face selection metric based on face detection confidence generally provides better recognition performance than random or sequential sampling. Moreover, the optimal number of faces varies drastically across selection metrics and across subsets of MOBIO. Given the trade-offs between computational effort, recognition accuracy and robustness, it is recommended that face feature clustering be used for batch processing (particularly for video-based watchlists), where it is most advantageous, and that face selection methods be limited to applications with significant computational restrictions.
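To make the three subset selection strategies concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each detected face comes with a scalar detection confidence, that "sequential selection" means taking the first N faces in temporal order, and that "random subset" means uniform sampling without replacement; the abstract does not define these details, and all function names are hypothetical.

```python
import random
from typing import List, Sequence


def select_by_confidence(confidences: Sequence[float], n: int) -> List[int]:
    """Return indices of the n faces with the highest detection confidence.

    Assumption: a higher score means a more reliable detection.
    """
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    # Re-sort the chosen indices so they stay in temporal order.
    return sorted(order[:n])


def select_random(num_faces: int, n: int, seed: int = 0) -> List[int]:
    """Return n face indices sampled uniformly without replacement."""
    rng = random.Random(seed)  # fixed seed only for repeatability
    return sorted(rng.sample(range(num_faces), min(n, num_faces)))


def select_sequential(num_faces: int, n: int) -> List[int]:
    """Return the first n face indices in temporal order (assumed meaning)."""
    return list(range(min(n, num_faces)))


if __name__ == "__main__":
    # Hypothetical per-frame detection confidences for one video.
    scores = [0.2, 0.9, 0.5, 0.7, 0.1]
    print(select_by_confidence(scores, 2))
    print(select_random(len(scores), 2))
    print(select_sequential(len(scores), 2))
```

Each function returns frame indices only, so the same downstream feature extraction and matching could be run on any of the three subsets when comparing them.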
