Context-Aided Human Recognition - Clustering

Context information other than faces, such as clothes, picture-taken-time and some logical constraints, can provide rich cues for recognizing people. This aim of this work is to automatically cluster pictures according to person's identity by exploiting as much context information as possible in addition to faces. Toward that end, a clothes recognition algorithm is first developed, which is effective for different types of clothes (smooth or highly textured). Clothes recognition results are integrated with face recognition to provide similarity measurements for clustering. Picture-taken-time is used when combining faces and clothes, and the cases of faces or clothes missing are handled in a principle way. A spectral clustering algorithm which can enforce hard constraints (positive and negative) is presented to incorporate logic-based cues (e.g. two persons in one picture must be different individuals) and user feedback. Experiments on real consumer photos show the effectiveness of the algorithm.

[1]  Thomas Leung,et al.  Texton Correlation for Recognition , 2004, ECCV.

[2]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[3]  Sergey Ioffe,et al.  Red eye detection with machine learning , 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[4]  Cordelia Schmid,et al.  A performance evaluation of local descriptors , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[6]  Yair Weiss,et al.  Segmentation using eigenvectors: a unifying view , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[7]  Tai Sing Lee,et al.  Computational models of perceptual organization , 2003 .

[8]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[9]  Paul A. Viola,et al.  Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[10]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[11]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[12]  Claire Cardie,et al.  Proceedings of the Eighteenth International Conference on Machine Learning, 2001, p. 577–584. Constrained K-means Clustering with Background Knowledge , 2022 .

[13]  Takeo Kanade,et al.  A statistical method for 3D object detection applied to faces and cars , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[14]  Jianbo Shi,et al.  Multiclass spectral clustering , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[15]  Sergey Ioffe,et al.  Probabilistic Linear Discriminant Analysis , 2006, ECCV.

[16]  Jianbo Shi,et al.  Grouping with Bias , 2001, NIPS.

[17]  Robert A. Jacobs,et al.  Hierarchical Mixtures of Experts and the EM Algorithm , 1993, Neural Computation.

[18]  Mingjing Li,et al.  Automated annotation of human faces in family albums , 2003, MULTIMEDIA '03.