Using a Multitasking GPU Environment for Content-Based Similarity Measures of Big Data

Performance and efficiency have recently become key requirements of computer architectures. Modern computers use Graphics Processing Units (GPUs) to run data mining algorithms as well as other general-purpose computations. In this paper, different parallelization methods are analyzed and compared in order to understand their applicability. From multi-threading on shared memory to NVIDIA’s GPU accelerators, this work discusses the parallelization of data mining algorithms with respect to performance and efficiency. Performance is compared on both many-core systems and GPU accelerators, using a distance measure algorithm on a relatively large data set. We optimize the way GPUs are used in heterogeneous systems to make them more suitable for big data mining applications with heavy distance calculations. Moreover, we focus on achieving higher utilization of GPU resources and better reuse of data. Our GPU implementation of the content-based similarity algorithm SQFD (Signature Quadratic Form Distance) outperforms CPU counterparts by up to 50× and multi-threaded CPU implementations by up to 15×.
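For readers unfamiliar with SQFD, the following is a minimal single-threaded sketch of the distance itself, not the GPU implementation evaluated in this paper. It assumes each feature signature is a list of (centroid, weight) pairs and uses a Gaussian similarity function with a hypothetical parameter `alpha`; the function names and the choice of similarity function are illustrative assumptions.

```python
import math


def gaussian_similarity(a, b, alpha=1.0):
    """Similarity of two centroids: exp(-alpha * squared Euclidean distance).

    The Gaussian form and the default alpha are assumptions for illustration.
    """
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-alpha * d2)


def sqfd(sig_q, sig_o, alpha=1.0):
    """Signature Quadratic Form Distance between two feature signatures.

    Each signature is a list of (centroid, weight) pairs. The weights of the
    second signature are negated and both signatures are concatenated, then
    the quadratic form w * A * w^T is evaluated, where A[i][j] is the
    similarity between centroid i and centroid j.
    """
    centroids = [c for c, _ in sig_q] + [c for c, _ in sig_o]
    weights = [w for _, w in sig_q] + [-w for _, w in sig_o]

    total = 0.0
    for i, ci in enumerate(centroids):
        for j, cj in enumerate(centroids):
            total += weights[i] * weights[j] * gaussian_similarity(ci, cj, alpha)

    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0))


if __name__ == "__main__":
    a = [((0.0, 0.0), 0.5), ((1.0, 1.0), 0.5)]
    b = [((0.0, 1.0), 0.7), ((2.0, 2.0), 0.3)]
    print(sqfd(a, b))
```

The nested loop over all centroid pairs is quadratic in the combined signature size, and a database query repeats it for every stored signature, which is one reason distance computations of this kind are candidates for parallel execution.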
