In this paper, we present a novel framework for summarizing large-scale web image collections by casting automatic image summarization as a dictionary learning problem for sparse coding: the summary of a given image set is treated as a sparse representation of that set (i.e., a sparse dictionary for the image set). For a given semantic category (i.e., a certain object class or image concept), we build a sparsity model that reconstructs all of its relevant images from a subset of the most representative images (i.e., the image summary), and we develop a stepwise basis selection algorithm that learns this sparse dictionary by minimizing an explicit optimization function. Because a good summary should reconstruct the original image set well, we use the reconstruction Mean Square Error (MSE) as an objective measure of the performance of different automatic image summarization algorithms. Our experimental results demonstrate that the proposed dictionary learning algorithm for sparse representation produces more accurate summaries than the baseline algorithms for automatic image summarization.
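To make the idea of stepwise selection under a reconstruction criterion concrete, the following is a minimal sketch and not the algorithm of this paper. It assumes each image is already described by a feature vector, and it simplifies the sparsity model to a 1-sparse code: every image is reconstructed from a single summary image scaled by a least-squares coefficient. The function names (`residual_sq`, `reconstruction_mse`, `greedy_summary`) and the toy feature vectors are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def residual_sq(x: Vector, d: Vector) -> float:
    """Squared error of reconstructing x as c * d, with c the least-squares scalar."""
    dd = sum(v * v for v in d)
    xd = sum(a * b for a, b in zip(x, d))
    c = xd / dd if dd > 0 else 0.0
    return sum((a - c * b) ** 2 for a, b in zip(x, d))


def reconstruction_mse(features: List[Vector], summary: List[int]) -> float:
    """Mean squared reconstruction error per image, given summary image indices.

    Assumption: a 1-sparse code, so each image uses its best single summary image.
    With an empty summary, every image is reconstructed as the zero vector.
    """
    if not summary:
        return sum(sum(v * v for v in x) for x in features) / len(features)
    total = 0.0
    for x in features:
        total += min(residual_sq(x, features[j]) for j in summary)
    return total / len(features)


def greedy_summary(features: List[Vector], k: int) -> List[int]:
    """Stepwise selection: at each step add the image that lowers the MSE the most."""
    summary: List[int] = []
    for _ in range(min(k, len(features))):
        candidates = [i for i in range(len(features)) if i not in summary]
        best = min(
            candidates,
            key=lambda i: reconstruction_mse(features, summary + [i]),
        )
        summary.append(best)
    return summary


if __name__ == "__main__":
    # Toy feature vectors (hypothetical): two rough "directions" in feature space.
    toy = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0], [1.0, 0.1]]
    chosen = greedy_summary(toy, 2)
    print(chosen, reconstruction_mse(toy, chosen))
```

The greedy loop re-evaluates the full objective for every candidate, which is quadratic in the number of images per step. That is acceptable for a sketch, but a large-scale setting would need caching of per-image residuals or a richer sparse coder that combines several summary images per reconstruction.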