Employing GPU architectures for permutation-based indexing

Permutation-based indexing is one of the most popular techniques for the approximate nearest-neighbor search problem in high-dimensional spaces. Due to the exponential growth of multimedia data, the time required to index this data has become a serious constraint. One possible step towards faster index construction is the use of massively parallel platforms such as GPGPU architectures. In this paper, we analyze the computational costs of the individual steps of permutation-based index construction in a high-dimensional feature space and summarize our hybrid CPU-GPU solution. The experience gained from this research may be applied to other problems that require computing Lp distances in high-dimensional spaces, parallel top-k selection, or partial sorting of multiple smaller sets. We also provide guidelines on how to balance the workload in hybrid CPU-GPU systems.
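To make the steps named above concrete, the following is a minimal sequential sketch of permutation-based index construction: each object is described by the indices of its k nearest pivots (reference objects), ordered by Lp distance. It is a plain CPU illustration under stated assumptions, not the hybrid CPU-GPU solution of the paper; the function names, the toy data, and the use of `heapq.nsmallest` for the top-k step are all illustrative choices.

```python
import heapq

def lp_distance(x, y, p=2.0):
    """Minkowski (Lp) distance between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def permutation_prefix(obj, pivots, k, p=2.0):
    """Indices of the k nearest pivots, ordered from closest to farthest.

    Hypothetical helper: ties are broken by the lower pivot index.
    """
    dists = [(lp_distance(obj, piv, p), i) for i, piv in enumerate(pivots)]
    # nsmallest does the top-k selection and returns the k entries sorted.
    return [i for _, i in heapq.nsmallest(k, dists)]

def build_index(objects, pivots, k, p=2.0):
    """One permutation prefix per object; objects are independent of each other."""
    return [permutation_prefix(o, pivots, k, p) for o in objects]

# Toy data (assumed for illustration): four pivots and two objects in 2-D.
pivots = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
objects = [(1.0, 2.0), (9.0, 8.0)]
index = build_index(objects, pivots, k=2)
print(index)  # [[0, 2], [3, 1]]
```

The three stages in the sketch (distance computation, top-k selection, ordering of the selected pivots) correspond to the steps whose costs the abstract says are analyzed. Because every object is processed independently, the outer loop is the natural unit of parallel work.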
