Quality Based Frame Selection for Face Clustering in News Video

Clustering identities in broadcast video is useful for video annotation and retrieval. Quality-based frame selection is a crucial step in video face clustering, both to improve clustering performance and to reduce computational cost. We present a framework that selects the highest-quality frames available in a video for face clustering. The frame selection technique uses low-level and high-level features (face symmetry, sharpness, contrast and brightness) to pick the highest-quality facial images in a face sequence. We also consider the temporal distribution of the faces so that the selected faces are spread throughout the sequence. Normalized feature scores are fused, and frames with high quality scores are passed to a face clustering system based on the Local Gabor Binary Pattern Histogram Sequence (LGBPHS). We present a news video database for evaluating clustering performance. Experiments on this newly created database show that the proposed method selects the best-quality face images in each video sequence, resulting in improved clustering performance.
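
The selection step described above (per-frame quality features, normalization, fusion, and temporally distributed selection) can be illustrated with a minimal sketch. The abstract does not specify how each feature is computed or how the scores are fused, so everything below is an assumption made for illustration: the function names are hypothetical, the feature definitions are simple stand-ins (mirror difference for symmetry, mean gradient magnitude for sharpness, standard deviation for contrast, mean intensity for brightness), normalization is min-max over the sequence, fusion is an unweighted mean, and temporal distribution is enforced by taking the best frame from each of several equal-length segments.

```python
import numpy as np

def frame_features(face):
    """Return (symmetry, sharpness, contrast, brightness) for a 2-D grayscale face in [0, 1].

    These are illustrative stand-ins, not the measures used in the paper.
    """
    face = np.asarray(face, dtype=np.float64)
    # Symmetry: 1 minus the mean absolute difference to the left-right mirror.
    symmetry = 1.0 - np.mean(np.abs(face - face[:, ::-1]))
    # Sharpness: mean gradient magnitude.
    gy, gx = np.gradient(face)
    sharpness = np.mean(np.hypot(gx, gy))
    # Contrast: standard deviation of intensities (RMS contrast).
    contrast = face.std()
    # Brightness: mean intensity (assumed here that brighter scores higher).
    brightness = face.mean()
    return np.array([symmetry, sharpness, contrast, brightness])

def fused_quality(faces):
    """Min-max normalize each feature over the sequence, then average (assumed fusion rule)."""
    feats = np.stack([frame_features(f) for f in faces])  # shape (n_frames, 4)
    lo, hi = feats.min(axis=0), feats.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero for constant features
    return ((feats - lo) / span).mean(axis=1)

def select_frames(scores, n_select):
    """Pick the highest-scoring frame index from each of n_select temporal segments."""
    scores = np.asarray(scores, dtype=np.float64)
    if len(scores) == 0 or n_select <= 0:
        return []
    segments = np.array_split(np.arange(len(scores)), min(n_select, len(scores)))
    return [int(seg[np.argmax(scores[seg])]) for seg in segments]
```

For a face sequence `faces` (a list of equally sized grayscale arrays), `select_frames(fused_quality(faces), 3)` would return up to three frame indices, one per third of the sequence. Segment-wise selection is only one way to spread the chosen faces over time; a weighted fusion or a minimum time gap between selected frames would be equally plausible readings of the description.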
