Unsupervised Patch-based Context from Millions of Images

The amount of labeled training data required for image interpretation tasks is a major drawback of current methods. How can we use the gigantic collection of unlabeled images available on the web to aid these tasks? In this paper, we present a simple approach based on the notion of patch-based context to extract useful priors for regions within a query image from a large collection of 6 million unlabeled images. This contextual prior over image classes acts as a non-redundant, complementary source of knowledge that helps disambiguate confusions in the predictions of local region-level features. We demonstrate our approach on the challenging tasks of region classification and surface layout estimation.
