Multi-attribute Queries: To Merge or Not to Merge?

Users often have very specific visual content in mind when they search. The most natural way to communicate this content to an image search engine is to use keywords that specify various properties, or attributes, of the content. A naive way of dealing with such multi-attribute queries is to train a classifier for each attribute independently and then combine their scores on images to judge their fit to the query. We argue that this may not be the most effective or efficient approach. Conjunctions of attributes often correspond to very characteristic appearances, so it would be beneficial to train classifiers that detect these conjunctions as a whole. But not all conjunctions result in such tight appearance clusters. So, given a multi-attribute query, which conjunctions should we model? An exhaustive evaluation of all possible conjunctions would be time-consuming. Hence we propose an optimization approach that identifies beneficial conjunctions without explicitly training the corresponding classifiers. It reasons about geometric quantities that capture notions similar to intra- and inter-class variances. We exploit a discriminative binary space to compute these geometric quantities efficiently. Experimental results on two challenging datasets of objects and birds show that our proposed approach can improve performance significantly over several strong baselines, while being an order of magnitude faster than exhaustively searching through all possible conjunctions.
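The idea of judging a conjunction by geometric quantities in a binary space, rather than by training its classifier, can be illustrated with a small sketch. This is not the optimization proposed in the paper, whose details the abstract does not give. It is a toy heuristic under stated assumptions: images are already encoded as short binary codes, distances are Hamming distances, and a conjunction is called "tight" when its positives lie closer to each other than to the remaining images. The function names (`hamming`, `mean_pairwise`, `mean_cross`, `conjunction_tightness`), the attribute names, and the data are all hypothetical.

```python
from itertools import combinations, product

def hamming(a, b):
    """Number of differing bits between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b))

def mean_pairwise(group):
    """Mean Hamming distance within a group (an intra-class spread proxy)."""
    pairs = list(combinations(group, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

def mean_cross(group_a, group_b):
    """Mean Hamming distance between two groups (an inter-class spread proxy)."""
    pairs = list(product(group_a, group_b))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

def conjunction_tightness(codes, labels, attrs):
    """Hypothetical score: inter-class minus intra-class Hamming spread.

    An image is a positive if it has every attribute in `attrs`.
    A larger score suggests a tighter, better separated appearance cluster.
    No classifier is trained; only distances between codes are used.
    """
    pos = [c for c, l in zip(codes, labels) if all(l[a] for a in attrs)]
    neg = [c for c, l in zip(codes, labels) if not all(l[a] for a in attrs)]
    return mean_cross(pos, neg) - mean_pairwise(pos)

# Toy data (assumed): 4-bit codes for six images and two made-up attributes.
codes = [
    (0, 0, 0, 0), (0, 0, 0, 1),  # red and round: codes close together
    (1, 1, 1, 1),                # red only
    (1, 1, 1, 0),                # round only
    (0, 1, 0, 1), (1, 0, 1, 0),  # neither
]
labels = [
    {"red": True, "round": True}, {"red": True, "round": True},
    {"red": True, "round": False},
    {"red": False, "round": True},
    {"red": False, "round": False}, {"red": False, "round": False},
]

score_conjunction = conjunction_tightness(codes, labels, ["red", "round"])
score_red_alone = conjunction_tightness(codes, labels, ["red"])
print(score_conjunction, score_red_alone)
```

In this toy data the conjunction "red and round" scores higher than "red" alone, which is the kind of signal that would favor modeling the conjunction as a whole. A real system would need a learned discriminative binary embedding and a principled selection criterion in place of this simple difference.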
