Efficient Scene Image Clustering for Internet Collections

This paper proposes an efficient approach for finding clusters of spatially related scene images collected from the web. Our method first builds a guide table that, for each image, ranks the other images by the relevance scores of the image pairs, as computed by image retrieval methods. Image clusters are then generated by repeatedly choosing a seed image and performing query expansion directed by the guide table. During the query process, feature matching uses an affine-invariant constraint, introduced here to reject outliers among the image feature correspondences effectively. The proposed clustering approach has been tested on the Bell Tower dataset, which consists of more than 1,000 images collected from the photo-sharing website Flickr.com. The experimental results demonstrate the efficiency and effectiveness of our method.
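The clustering loop described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `build_guide_table` and `cluster_images`, the `top_k` cutoff, the lowest-index seed selection rule, and the `verify` callback (a stand-in for feature matching with the affine-invariant constraint) are all assumptions made for illustration.

```python
from collections import deque


def build_guide_table(scores, top_k):
    """For each image, list the top_k other images by descending relevance score.

    `scores` is a square matrix (list of lists) of pairwise relevance scores,
    assumed to come from some image retrieval method.
    """
    n = len(scores)
    table = []
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: scores[i][j],
            reverse=True,
        )
        table.append(ranked[:top_k])
    return table


def cluster_images(scores, verify, top_k=3):
    """Grow clusters from seed images by query expansion over the guide table.

    `verify(query, candidate)` is a hypothetical callback standing in for
    geometric verification of feature matches between two images.
    """
    guide = build_guide_table(scores, top_k)
    unassigned = set(range(len(scores)))
    clusters = []
    while unassigned:
        # Assumption: the seed is simply the lowest-index unassigned image.
        seed = min(unassigned)
        unassigned.remove(seed)
        cluster = [seed]
        queue = deque([seed])
        while queue:
            query = queue.popleft()
            # Only candidates ranked in the guide table are verified,
            # which avoids matching every image pair.
            for cand in guide[query]:
                if cand in unassigned and verify(query, cand):
                    unassigned.remove(cand)
                    cluster.append(cand)
                    queue.append(cand)  # verified images become new queries
        clusters.append(cluster)
    return clusters
```

In this sketch the guide table limits the expensive verification step to a few top-ranked candidates per query, and each verified image is reused as a query so that a cluster can grow beyond the seed's immediate neighbors.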