Image collection summarization via dictionary learning for sparse representation

In this paper, we develop a novel framework for summarizing large-scale image collections by treating automatic image summarization as a dictionary learning problem for sparse representation: the summary serves as a dictionary from which the given image set can be sparsely reconstructed. For an image set of a specific category or a mixture of multiple categories, we build a sparsity model that reconstructs all of its images from a subset of the most representative images (i.e., the image summary), and we adopt the simulated annealing algorithm to learn this sparse dictionary by minimizing an explicit optimization function. We quantitatively measure the performance of various summarization algorithms by investigating their reconstruction ability under a sparsity constraint and a diversity constraint. Our experimental results show that the proposed dictionary learning for sparse representation algorithm obtains more accurate summaries than the baseline algorithms.
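To make the idea concrete, the following is a minimal sketch of summary selection by simulated annealing, not the paper's actual formulation. The abstract does not state the objective, so everything here is an assumption made for illustration: images are columns of a feature matrix `X`, the reconstruction term is the mean squared error of a greedy orthogonal matching pursuit with at most `sparsity` summary images per image, the diversity term is the mean absolute cosine similarity between summary images, and the names `omp_residual`, `objective`, `summarize`, and all parameter defaults are hypothetical.

```python
import numpy as np

def omp_residual(D, x, sparsity):
    """Squared error of approximating x with at most `sparsity` columns of D
    (greedy orthogonal matching pursuit; an assumed sparse coder)."""
    residual = x.copy()
    chosen = []
    for _ in range(min(sparsity, D.shape[1])):
        scores = np.abs(D.T @ residual)
        if chosen:
            scores[chosen] = -1.0  # never pick the same atom twice
        chosen.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(D[:, chosen], x, rcond=None)
        residual = x - D[:, chosen] @ coef
    return float(residual @ residual)

def objective(X, summary, sparsity, diversity_weight):
    """Assumed cost: mean reconstruction error plus a redundancy penalty."""
    D = X[:, summary]
    n = X.shape[1]
    recon = sum(omp_residual(D, X[:, j], sparsity) for j in range(n)) / n
    gram = np.abs(D.T @ D)
    k = len(summary)
    # Mean off-diagonal similarity; low values mean a more diverse summary.
    redundancy = (gram.sum() - np.trace(gram)) / max(k * (k - 1), 1)
    return recon + diversity_weight * redundancy

def summarize(X, k, sparsity=2, diversity_weight=0.1,
              steps=300, t0=1.0, cooling=0.98, seed=0):
    """Pick k column indices of X as the summary via simulated annealing."""
    n = X.shape[1]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of images")
    rng = np.random.default_rng(seed)
    # Unit-normalize features (assumes no all-zero column).
    X = X / np.linalg.norm(X, axis=0, keepdims=True)
    current = [int(i) for i in rng.choice(n, size=k, replace=False)]
    cur_cost = objective(X, current, sparsity, diversity_weight)
    best, best_cost = list(current), cur_cost
    temperature = t0
    for _ in range(steps):
        # Neighbor move: swap one summary image for one outside the summary.
        outside = [j for j in range(n) if j not in current]
        candidate = list(current)
        candidate[int(rng.integers(k))] = outside[int(rng.integers(len(outside)))]
        cost = objective(X, candidate, sparsity, diversity_weight)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cost < cur_cost or rng.random() < np.exp((cur_cost - cost) / temperature):
            current, cur_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = list(candidate), cost
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return sorted(best), best_cost
```

For example, `summarize(X, k=4)` on a feature matrix with one column per image returns the indices of four images and the cost of that selection. Restricting dictionary atoms to actual images, rather than learning free-form atoms, is what makes the learned dictionary usable as a summary.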
