Data-driven visual similarity for cross-domain image matching

The goal of this work is to find visually similar images even if they appear quite different at the raw pixel level. This task is particularly important for matching images across visual domains, such as photos taken over different seasons or lighting conditions, paintings, hand-drawn sketches, etc. We propose a surprisingly simple method that estimates the relative importance of different features in a query image based on the notion of "data-driven uniqueness". We employ standard tools from discriminative object detection in a novel way, yielding a generic approach that does not depend on a particular image representation or a specific visual domain. Our approach shows good performance on a number of difficult cross-domain visual tasks e.g., matching paintings or sketches to real photographs. The method also allows us to demonstrate novel applications such as Internet re-photography, and painting2gps. While at present the technique is too computationally intensive to be practical for interactive image retrieval, we hope that some of the ideas will eventually become applicable to that domain as well.

[1]  C. Koch,et al.  A saliency-based search mechanism for overt and covert shifts of visual attention , 2000, Vision Research.

[2]  Richard Szeliski,et al.  Video textures , 2000, SIGGRAPH.

[3]  David Salesin,et al.  Image Analogies , 2001, SIGGRAPH.

[4]  Alexei A. Efros,et al.  Image quilting for texture synthesis and transfer , 2001, SIGGRAPH.

[5]  William T. Freeman,et al.  Example-Based Super-Resolution , 2002, IEEE Computer Graphics and Applications.

[6]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[7]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[8]  R. Sukthankar,et al.  Object-based image retrieval using the statistical structure of images , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[9]  John Hart,et al.  ACM Transactions on Graphics , 2004, SIGGRAPH 2004.

[10]  Derek Hoiem,et al.  Object-based image retrieval using the statistical structure of images , 2004, CVPR 2004.

[11]  Paul A. Viola,et al.  Boosting Image Retrieval , 2004, International Journal of Computer Vision.

[12]  Michal Irani,et al.  Detecting Irregularities in Images and in Video , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[13]  Jean-Michel Morel,et al.  A non-local algorithm for image denoising , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[15]  Jianguo Zhang,et al.  The PASCAL Visual Object Classes Challenge , 2006 .

[16]  Antonio Torralba,et al.  Building the gist of a scene: the role of global image features in recognition. , 2006, Progress in brain research.

[17]  Eli Shechtman,et al.  Space-Time Completion of Video , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Eli Shechtman,et al.  Matching Local Self-Similarities across Images and Videos , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Alexei A. Efros,et al.  Scene completion using millions of photographs , 2007, SIGGRAPH 2007.

[20]  Richard Szeliski,et al.  Finding paths through the world's photos , 2008, ACM Trans. Graph..

[21]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[22]  James Ze Wang,et al.  Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.

[23]  A. Torralba,et al.  Creating and exploring a large photorealistic virtual space , 2010, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[24]  Steven J. Gortler,et al.  A perception-based color space for illumination-invariant image processing , 2008, SIGGRAPH 2008.

[25]  Antonio Torralba,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[26]  Alexei A. Efros,et al.  IM2GPS: estimating geographic information from a single image , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Steven J. Gortler,et al.  A perception-based color space for illumination-invariant image processing , 2008, ACM Trans. Graph..

[28]  Wojciech Matusik,et al.  Image restoration using online photo collections , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[29]  Cordelia Schmid,et al.  Spatial pyramid matching , 2009 .

[30]  Shi-Min Hu,et al.  Sketch2Photo: internet image montage , 2009, ACM Trans. Graph..

[31]  Andrew Zisserman,et al.  Get Out of my Picture! Internet-based Inpainting , 2009, BMVC.

[32]  Alexei A. Efros,et al.  Beyond Categories: The Visual Memex Model for Reasoning About Object Relationships , 2009, NIPS.

[33]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[34]  Tal Hassner,et al.  The One-Shot similarity kernel , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[35]  Frédo Durand,et al.  Learning to predict where humans look , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[36]  Shi-Min Hu,et al.  Sketch2Photo: internet image montage , 2009, ACM Trans. Graph..

[37]  Raanan Fattal,et al.  Image upsampling via texture hallucination , 2010, 2010 IEEE International Conference on Computational Photography (ICCP).

[38]  Marc Alexa,et al.  Sketch-based 3D shape retrieval , 2010, SIGGRAPH '10.

[39]  Antonio Torralba,et al.  Infinite Images: Creating and Exploring a Large Photorealistic Virtual Space , 2008, Proceedings of the IEEE.

[40]  Frédo Durand,et al.  Computational rephotography , 2010, TOGS.

[41]  Dhiraj Joshi,et al.  Object Categorization, Computer and Human Vision Perspectives , 2010, J. Electronic Imaging.

[42]  Ira Kemelmacher-Shlizerman,et al.  Exploring photobios , 2011, ACM Trans. Graph..

[43]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[44]  Ira Kemelmacher-Shlizerman,et al.  Exploring photobios , 2011, SIGGRAPH 2011.

[45]  Marc Alexa,et al.  Sketch-Based Image Retrieval: Benchmark and Bag-of-Features Descriptors , 2011, IEEE Transactions on Visualization and Computer Graphics.

[46]  Jean Ponce,et al.  Automatic alignment of paintings and photographs depicting a 3D scene , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[47]  Wojciech Matusik,et al.  CG2Real: Improving the Realism of Computer Generated Images Using a Large Collection of Photographs , 2011, IEEE Transactions on Visualization and Computer Graphics.

[48]  Alexei A. Efros,et al.  Ensemble of exemplar-SVMs for object detection and beyond , 2011, 2011 International Conference on Computer Vision.