WhittleSearch: Interactive Image Search with Relative Attribute Feedback

We propose a novel mode of feedback for image search, in which a user describes which properties of exemplar images should be adjusted to more closely match his/her mental model of the image sought. For example, perusing image results for the query “black shoes”, the user might state, “Show me shoe images like these, but sportier.” Offline, our approach first learns a set of ranking functions, each of which predicts the relative strength of a nameable attribute in an image (e.g., sportiness). At query time, the system presents the user with a set of exemplar images, and the user relates them to his/her target image with comparative statements. Using a series of such constraints in the multi-dimensional attribute space, our method iteratively updates its relevance function and re-ranks the database of images. To determine which exemplar images receive feedback from the user, we present two variants of the approach: one in which the feedback is user-initiated and another in which it is actively system-initiated. In either case, our approach allows a user to efficiently “whittle away” irrelevant portions of the visual feature space, using semantic language to precisely communicate his/her preferences to the system. We demonstrate our technique for refining image search for people, products, and scenes, and we show that it outperforms traditional binary relevance feedback in terms of search speed and accuracy. In addition, the ordinal nature of relative attributes helps make our active approach efficient, both computationally for the machine when selecting the reference images and for the user, who needs fewer interactions than with conventional passive and active methods.
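The re-ranking loop described above can be illustrated with a minimal sketch. It assumes that per-image attribute strengths are already available (the hard-coded `ATTRIBUTE_SCORES` table is a hypothetical stand-in for the outputs of the offline-learned ranking functions) and that relevance is simply the number of comparative constraints an image satisfies. The abstract does not spell out the relevance function, so this scoring rule, along with the names `Constraint`, `satisfies`, and `rerank`, is an illustrative assumption rather than the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical attribute strengths, standing in for the outputs of the
# offline-learned ranking functions (higher = attribute more present).
ATTRIBUTE_SCORES: Dict[str, Dict[str, float]] = {
    "shoe_a": {"sporty": 0.9, "formal": 0.1},
    "shoe_b": {"sporty": 0.5, "formal": 0.4},
    "shoe_c": {"sporty": 0.2, "formal": 0.8},
    "shoe_d": {"sporty": 0.7, "formal": 0.3},
}

# One feedback statement: (reference image, attribute, "more" or "less"),
# read as "my target is more/less <attribute> than <reference image>".
Constraint = Tuple[str, str, str]


def satisfies(image: str, constraint: Constraint,
              scores: Dict[str, Dict[str, float]]) -> bool:
    """Return True if `image` is consistent with one comparative statement."""
    reference, attribute, direction = constraint
    if direction == "more":
        return scores[image][attribute] > scores[reference][attribute]
    if direction == "less":
        return scores[image][attribute] < scores[reference][attribute]
    raise ValueError(f"unknown direction: {direction!r}")


def rerank(constraints: List[Constraint],
           scores: Dict[str, Dict[str, float]]) -> List[str]:
    """Order images by how many constraints they satisfy (assumed relevance).

    Ties are broken alphabetically by image name so the order is deterministic.
    """
    def relevance(image: str) -> int:
        return sum(satisfies(image, c, scores) for c in constraints)

    return sorted(scores, key=lambda image: (-relevance(image), image))


if __name__ == "__main__":
    feedback: List[Constraint] = []
    # Each round adds one statement and re-ranks the whole database.
    for statement in [("shoe_b", "sporty", "more"), ("shoe_a", "formal", "more")]:
        feedback.append(statement)
        print(statement, "->", rerank(feedback, ATTRIBUTE_SCORES))
```

Each new statement is appended to the accumulated feedback before re-ranking, which mirrors the iterative "whittling" of the attribute space: images that violate more of the user's comparisons sink lower in the list with every round.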
