Computing iconic summaries of general visual concepts

This paper considers the problem of selecting iconic images to summarize general visual categories. We define iconic images as high-quality representatives of a large group of images consistent both in appearance and semantics. To find such groups, we perform joint clustering in the space of global image descriptors and latent topic vectors of tags associated with the images. To select the representative iconic images for the joint clusters, we use a quality ranking learned from a large collection of labeled images. For the purposes of visualization, iconic images are grouped by semantic "theme" and multidimensional scaling is used to compute a 2D layout that reflects the relationships between the themes. Results on four large-scale datasets demonstrate the ability of our approach to discover plausible themes and recurring visual motifs for challenging abstract concepts such as "love" and "beauty".
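The first two steps described above (joint clustering, then picking one representative per cluster by quality) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify the clustering algorithm, so plain k-means over a weighted concatenation of the two feature types is an assumption, as are the function names, the `alpha` weight, and the toy data. The quality scores are taken as given here, whereas the paper learns a ranking from labeled images, and the multidimensional scaling layout is omitted.

```python
import math
import random


def joint_features(visual, topics, alpha=0.5):
    """Concatenate visual descriptors and tag-topic vectors.

    alpha is a hypothetical weight balancing appearance against semantics.
    """
    return [
        [alpha * v for v in vis] + [(1.0 - alpha) * t for t in top]
        for vis, top in zip(visual, topics)
    ]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in); returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def pick_iconic(labels, quality):
    """For each cluster label, return the index of its highest-quality image."""
    best = {}
    for idx, (lab, q) in enumerate(zip(labels, quality)):
        if lab not in best or q > quality[best[lab]]:
            best[lab] = idx
    return best


# Toy data: six images forming two groups that agree in appearance and tags.
visual = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
topics = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
quality = [0.2, 0.9, 0.5, 0.7, 0.1, 0.8]  # hypothetical quality scores

labels = kmeans(joint_features(visual, topics), k=2)
iconic = pick_iconic(labels, quality)
print(sorted(iconic.values()))
```

Clustering in the concatenated space means two images land in the same group only when they are close in both appearance and tag topics, which is the consistency requirement stated in the definition of an iconic image.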
