Probabilistic Multimodality Fusion for Event based Home Photo Clustering

This paper presents a novel probabilistic approach to fusing multimodal metadata for event-based home photo clustering. Photo events are characterized by coherence across multiple modalities, including time, content, and camera settings. We incorporate these multimodal metadata into a unified probabilistic framework in which an event is treated as a latent semantic concept and discovered by fitting a generative model with an expectation-maximization (EM) algorithm. The approach is general and unsupervised, requiring no training procedure or predefined threshold. Experimental evaluations on 14k photos taken by 10 amateur photographers indicate the effectiveness and efficiency of the proposed framework for browsing and searching personal photo collections.
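The abstract does not spell out the generative model, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes each photo is described by two numeric modalities (capture time in hours and focal length in mm, a hypothetical stand-in for camera settings), that the modalities are conditionally independent given the latent event, and that each is Gaussian within an event. The content modality is omitted, and the number of events `k` is fixed by hand here, whereas the paper's approach may determine it differently. The function name `em_event_clustering` and all parameters are illustrative.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def em_event_clustering(photos, k, iters=50, var_floor=1e-3):
    """Soft-cluster photos into k latent events with EM.

    photos: list of equal-length tuples, one float per modality
            (assumed layout: (hours since first photo, focal length in mm)).
    Returns the hard event label (argmax responsibility) for each photo.
    """
    n, d = len(photos), len(photos[0])
    # Deterministic initialization: take k photos spread along the sorted
    # order (tuples sort by the first modality, i.e. time) as initial means.
    ordered = sorted(photos)
    means = [list(ordered[(2 * j + 1) * n // (2 * k)]) for j in range(k)]
    # Every event starts with the global per-modality variance.
    g_mean = [sum(p[m] for p in photos) / n for m in range(d)]
    g_var = [max(sum((p[m] - g_mean[m]) ** 2 for p in photos) / n, var_floor)
             for m in range(d)]
    variances = [list(g_var) for _ in range(k)]
    weights = [1.0 / k] * k

    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each event for each photo.
        # Modalities are fused by summing their per-event log-likelihoods.
        resp = []
        for p in photos:
            logs = [math.log(weights[j])
                    + sum(log_gauss(p[m], means[j][m], variances[j][m])
                          for m in range(d))
                    for j in range(k)]
            top = max(logs)  # subtract the max for numerical stability
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: re-estimate event priors, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            for m in range(d):
                mu = sum(r[j] * p[m] for r, p in zip(resp, photos)) / nj
                var = sum(r[j] * (p[m] - mu) ** 2
                          for r, p in zip(resp, photos)) / nj
                means[j][m] = mu
                variances[j][m] = max(var, var_floor)
    return [max(range(k), key=lambda j: r[j]) for r in resp]

# Toy data: four photos from one afternoon, three from two days later.
photos = [(0.0, 35.0), (0.1, 35.0), (0.2, 38.0), (0.3, 35.0),
          (48.0, 50.0), (48.2, 52.0), (48.5, 50.0)]
labels = em_event_clustering(photos, k=2)
print(labels)
```

Working in log space and summing per-modality log-likelihoods is what lets additional modalities be added without changing the E-step; the variance floor is a practical guard against an event collapsing onto identical values.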
