Multimodal latent topic analysis for image collection summarization

We present a new multimodal image collection summarization method. The method is based on latent topic analysis. Textual and visual modalities are fused in the same latent space using convex non-negative matrix factorization. The resulting multimodal summary combines textual and visual information. We evaluate the proposed method using reconstruction error and summary diversity.

This paper presents a multimodal latent topic analysis method for constructing image collection summaries. Given a query, the method automatically selects a set of prototypical images from a large set of retrieved images. We define an image collection summary as a subset of images from a collection that is visually and semantically representative of that collection. To build such a summary we propose MICS (Multimodal Image Collection Summarization), a method that combines textual and visual modalities in a common latent space, which makes it possible to find a subset of images from which the whole collection can be reconstructed. Experiments on two collections of tagged images demonstrate the ability of the approach to build summaries with representative visual and semantic content. The method was evaluated using two objective measures, reconstruction error and summary diversity, and showed competitive results compared with other summarization approaches.
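
As a rough illustration of the idea, the following is a minimal sketch and not the MICS implementation. It assumes that the two modalities are fused by stacking a visual feature matrix on top of a textual one (columns are images), that convex non-negative matrix factorization is computed with the standard multiplicative updates, and that one prototypical image is picked per latent topic by taking the largest weight in the corresponding column of W. The toy data, the function names convex_nmf and select_summary, and the selection rule are all hypothetical.

```python
import numpy as np


def convex_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate X (features x images) as X @ W @ G.T with W, G >= 0.

    Each latent topic is a convex-like combination X @ W[:, j] of images,
    so large entries of W point at images that define the topic.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    W = rng.random((n, k))
    G = rng.random((n, k))
    A = X.T @ X                      # image-by-image Gram matrix
    A_pos = (np.abs(A) + A) / 2      # positive part of A
    A_neg = (np.abs(A) - A) / 2      # negative part of A (zero if X >= 0)
    for _ in range(n_iter):
        GWt = G @ W.T
        G *= np.sqrt((A_pos @ W + GWt @ A_neg @ W + eps)
                     / (A_neg @ W + GWt @ A_pos @ W + eps))
        WGtG = W @ (G.T @ G)
        W *= np.sqrt((A_pos @ G + A_neg @ WGtG + eps)
                     / (A_neg @ G + A_pos @ WGtG + eps))
    return W, G


def select_summary(W):
    """Pick one distinct image per topic: the highest-weighted unused one.

    This selection rule is an assumption made for the sketch.
    """
    chosen = []
    for j in range(W.shape[1]):
        for idx in np.argsort(-W[:, j]):
            if int(idx) not in chosen:
                chosen.append(int(idx))
                break
    return chosen


# Toy data (hypothetical): 12 images, 20 visual and 8 textual features.
rng = np.random.default_rng(42)
visual = rng.random((20, 12))        # e.g. bag-of-visual-words histograms
textual = rng.random((8, 12))        # e.g. tag term weights
X = np.vstack([visual, textual])     # simple fusion by stacking modalities

W, G = convex_nmf(X, k=3)
summary = select_summary(W)          # indices of the summary images
err = np.linalg.norm(X - X @ W @ G.T) / np.linalg.norm(X)
print("summary images:", summary)
print("relative reconstruction error:", err)
```

The relative reconstruction error printed at the end corresponds to the first evaluation measure named above; a diversity measure over the selected images would have to be computed separately.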
