Asymmetric Mapping Quantization for Nearest Neighbor Search

Nearest neighbor search is a fundamental problem in computer vision and machine learning. The straightforward solution, linear scan, is both computationally and memory intensive for large-scale, high-dimensional data, and is therefore impractical. Consequently, there has been considerable interest in algorithms that perform approximate nearest neighbor (ANN) search. In this paper, we propose a novel addition-based vector quantization algorithm, Asymmetric Mapping Quantization (AMQ), to conduct ANN search efficiently. Existing addition-based quantization methods struggle with the problem caused by the norm of the database vector. In contrast, we map the query vector and the database vector using different mapping functions, which transforms the computation of the L2 distance into an inner product similarity, so the norm of the database vector does not need to be evaluated. Moreover, we propose Distributed Asymmetric Mapping Quantization (DAMQ), which enables AMQ to work on very large datasets through distributed learning. Extensive experiments on approximate nearest neighbor search and image retrieval validate the merits of the proposed AMQ and DAMQ.
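
The abstract does not spell out the two mapping functions, so the following is only a minimal sketch of the general idea, using one well-known asymmetric mapping as an assumption: the query q is mapped to [-2q, 1] and the database vector x to [x, ||x||^2]. Their inner product equals ||q - x||^2 - ||q||^2, and since ||q||^2 is constant for a given query, ranking by this inner product matches ranking by L2 distance. The function names (`map_query`, `map_database`, and so on) are hypothetical, and the mappings actually used by AMQ may differ.

```python
# Illustrative sketch only: a standard asymmetric mapping that turns
# L2-distance ranking into inner-product ranking. This is an assumed
# example, not necessarily the mapping used by AMQ.

def map_query(q):
    """Hypothetical query-side mapping: q -> [-2q, 1]."""
    return [-2.0 * v for v in q] + [1.0]

def map_database(x):
    """Hypothetical database-side mapping: x -> [x, ||x||^2]."""
    return list(x) + [sum(v * v for v in x)]

def inner(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(u * v for u, v in zip(a, b))

def sq_l2(a, b):
    """Squared Euclidean distance."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def rank_by_l2(q, database):
    """Indices of database vectors, nearest first, by squared L2 distance."""
    return sorted(range(len(database)), key=lambda i: sq_l2(q, database[i]))

def rank_by_mapped_inner_product(q, database):
    """Same ranking, computed only from inner products of mapped vectors."""
    mq = map_query(q)
    return sorted(range(len(database)),
                  key=lambda i: inner(mq, map_database(database[i])))

query = [1.0, 2.0]
database = [[0.0, 0.0], [1.0, 1.5], [3.0, -1.0], [1.0, 4.0]]

print(rank_by_l2(query, database))
print(rank_by_mapped_inner_product(query, database))
```

Both calls produce the same ordering. In this sketch the squared norm becomes one more coordinate of the mapped database vector, so a quantizer applied to the mapped vectors would approximate it together with the other coordinates instead of handling it as a separate term; whether AMQ does exactly this is not stated in the abstract.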
