Paper Information - Joint People, Event, and Location Recognition in Personal Photo Collections Using Cross-Domain Context

Joint People, Event, and Location Recognition in Personal Photo Collections Using Cross-Domain Context

We present a framework for vision-assisted tagging of personal photo collections using context. Whereas previous efforts mainly focus on tagging people, we develop a unified approach to jointly tag across multiple domains (specifically people, events, and locations). The heart of our approach is a generic probabilistic model of context that couples the domains through a set of cross-domain relations. Each relation models how likely the instances in two domains are to co-occur. Based on this model, we derive an algorithm that simultaneously estimates the cross-domain relations and infers the unknown tags in a semi-supervised manner. We conducted experiments on two well-known datasets and obtained significant performance improvements in both people and location recognition. We also demonstrated the ability to infer event labels for photos with missing timestamps (i.e., with no event features).

Gang Hua | Ashish Kapoor | Dahua Lin | Simon Baker
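
The abstract describes two coupled steps: estimating how likely instances in two domains are to co-occur, and using those relations to infer unknown tags from a partially labeled collection. The sketch below is a toy illustration of that idea only, not the authors' model or algorithm. It assumes just two domains (people and locations), exactly one person and one location per photo, and a simple alternating update with a PMI-style relation matrix. The function name `joint_tag` and the parameters `weight` and `smoothing` are hypothetical.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def clamp(beliefs, labels, n_classes):
    """Overwrite the beliefs of labeled photos with one-hot vectors."""
    beliefs = beliefs.copy()
    known = labels >= 0
    beliefs[known] = np.eye(n_classes)[labels[known]]
    return beliefs


def joint_tag(person_scores, place_scores, person_labels, place_labels,
              n_iters=20, weight=1.0, smoothing=1.0):
    """Toy semi-supervised joint tagging over two domains (hypothetical).

    person_scores: (N, P) appearance log-scores per photo.
    place_scores:  (N, L) appearance log-scores per photo.
    *_labels:      length-N integer arrays, -1 marks an unknown tag.
    Returns soft beliefs for both domains and the relation matrix.
    """
    person_labels = np.asarray(person_labels)
    place_labels = np.asarray(place_labels)
    n_people = person_scores.shape[1]
    n_places = place_scores.shape[1]

    q_p = clamp(softmax(person_scores), person_labels, n_people)
    q_l = clamp(softmax(place_scores), place_labels, n_places)

    for _ in range(n_iters):
        # Step 1: estimate the cross-domain relation from expected
        # co-occurrence counts (smoothed), as a PMI-style log ratio.
        counts = q_p.T @ q_l + smoothing                  # (P, L)
        joint = counts / counts.sum()
        relation = (np.log(joint)
                    - np.log(joint.sum(axis=1, keepdims=True))
                    - np.log(joint.sum(axis=0, keepdims=True)))

        # Step 2: update each domain's beliefs using the other domain's
        # current beliefs, keeping the user-provided tags fixed.
        q_p = clamp(softmax(person_scores + weight * q_l @ relation.T),
                    person_labels, n_people)
        q_l = clamp(softmax(place_scores + weight * q_p @ relation),
                    place_labels, n_places)

    return q_p, q_l, relation


# Six photos with uninformative appearance scores. Person 0 is tagged at
# location 0 and person 1 at location 1. Photo 4 lacks a location tag and
# photo 5 lacks a person tag.
person_labels = [0, 0, 1, 1, 0, -1]
place_labels = [0, 0, 1, 1, -1, 1]
q_p, q_l, relation = joint_tag(np.zeros((6, 2)), np.zeros((6, 2)),
                               person_labels, place_labels)
print("location beliefs for photo 4:", q_l[4])
print("person beliefs for photo 5:", q_p[5])
```

In this toy setup the appearance scores carry no information, so any preference for a missing tag comes entirely from the learned co-occurrence relation. That loosely mirrors the abstract's claim that labels can be inferred for a domain with no features of its own.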
