Paper information - Discovering states and transformations in image collections

Objects in visual scenes come in a rich variety of transformed states. A few classes of transformation have been heavily studied in computer vision: mostly simple, parametric changes in color and geometry. However, transformations in the physical world occur in many more flavors, and they come with semantic meaning: e.g., bending, folding, aging, etc. The transformations an object can undergo tell us about its physical and functional properties. In this paper, we introduce a dataset of objects, scenes, and materials, each of which is found in a variety of transformed states. Given a novel collection of images, we show how to explain the collection in terms of the states and transformations it depicts. Our system works by generalizing across object classes: states and transformations learned on one set of objects are used to interpret the image collection for an entirely new object class.
