Multiple Exemplars Learning for Fast Image Retrieval

Over the past decade, we have witnessed rapid progress in compact representation learning for fast image retrieval. In the unsupervised scenario, product quantization (PQ) is one of the most promising methods for generating compact image representations that support fast and accurate retrieval. Inspired by the success of deep neural networks (DNNs) in computer vision, many works have attempted to integrate PQ into DNNs for end-to-end supervised training. Nevertheless, in existing deep PQ methods, data samples from different classes share the same codebook, so they may become entangled with each other in the feature space. Moreover, existing deep PQ methods that rely on a triplet or pairwise loss require a huge number of training triplets or pairs, which is computationally expensive and scales poorly. In this work, we propose a multiple exemplars learning (MEL) approach to improve both retrieval accuracy and training efficiency. For each class, we learn a class-specific codebook consisting of multiple exemplars that partition the class-specific feature space. Since both the feature space and the codebook are class-specific, samples of different classes are disentangled in the feature space. We incorporate the proposed MEL into a convolutional neural network, which supports end-to-end training. We also propose an MEL loss that trains the network considerably more efficiently than existing deep product quantization approaches based on a pairwise or triplet loss. Systematic experiments conducted on two public benchmarks demonstrate the effectiveness and efficiency of our method.
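For readers unfamiliar with PQ, the following is a minimal sketch of standard (unsupervised) product quantization encoding with a single shared set of codebooks, i.e. the baseline setting described above and not the proposed MEL method. The function names, the toy codebooks, and the example vector are all illustrative assumptions; in practice the codebooks would be learned, for example with k-means on each subspace.

```python
from typing import List, Sequence

Vector = Sequence[float]
Codebook = List[List[float]]  # K codewords, each of length d / M


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(x: Vector, m: int) -> List[List[float]]:
    """Split x into m contiguous sub-vectors of equal length."""
    if len(x) % m != 0:
        raise ValueError("dimension must be divisible by the number of subspaces")
    step = len(x) // m
    return [list(x[i * step:(i + 1) * step]) for i in range(m)]


def pq_encode(x: Vector, codebooks: List[Codebook]) -> List[int]:
    """Return, for each subspace, the index of the nearest codeword."""
    subs = split(x, len(codebooks))
    return [
        min(range(len(cb)), key=lambda k: sq_dist(sub, cb[k]))
        for sub, cb in zip(subs, codebooks)
    ]


def pq_asymmetric_dist(query: Vector, code: List[int],
                       codebooks: List[Codebook]) -> float:
    """Approximate squared distance between a raw query and an encoded item."""
    subs = split(query, len(codebooks))
    return sum(sq_dist(sub, cb[k]) for sub, cb, k in zip(subs, codebooks, code))


# Toy setup (assumed values): 4-D vectors, M = 2 subspaces, K = 2 codewords each.
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # codebook for dimensions 0-1
    [[0.0, 1.0], [1.0, 0.0]],  # codebook for dimensions 2-3
]
item = [0.9, 1.1, 0.1, 0.8]
item_code = pq_encode(item, toy_codebooks)
```

Each item is stored as M small integers instead of d floats, and a query is compared to it through per-subspace codeword distances. Because every class shares these codebooks, nothing in the quantizer itself separates classes, which is the limitation the abstract attributes to existing deep PQ methods.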
