Gaining knowledge from vast datasets is a central challenge in today's data-driven applications. Sparse grids provide a numerical method for both classification and regression in data mining that scales only linearly in the number of data points and is thus well suited for huge amounts of data. Due to the recursive nature of sparse grid algorithms, they pose a challenge for parallelization on modern hardware architectures such as accelerators. In this paper, we present the parallelization on several current task- and data-parallel platforms, covering multi-core CPUs with vector units, GPUs, and hybrid systems. Furthermore, we analyze the suitability of parallel programming languages for the implementation.
Considering hardware, we restrict ourselves to the x86 platform with the SSE and AVX vector extensions and to NVIDIA's Fermi architecture for GPUs. We consider both multi-core CPU and GPU architectures independently, as well as hybrid systems with up to 12 cores and 2 Fermi GPUs. With respect to parallel programming, we examine both the open standard OpenCL and Intel Array Building Blocks, a recently introduced high-level programming approach. As the baseline, we use the best results obtained with classically parallelized sparse grid algorithms and their OpenMP-parallelized intrinsics counterparts (SSE and AVX instructions), reporting both single- and double-precision measurements. The huge datasets we use comprise a real-life dataset stemming from astrophysics and an artificial one exhibiting challenging properties. In all settings we achieve excellent results, obtaining speedups of more than 60 using single precision on a hybrid system.
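The linear scaling in the number of data points mentioned above can be illustrated with a minimal sketch of evaluating a sparse grid function at a set of data points, using the standard piecewise-linear hat basis. This is not the authors' implementation; the function names and the dense NumPy layout are illustrative assumptions. The inner sum over M basis functions is evaluated for each of the N data points, so the cost is O(N · M), i.e. linear in N:

```python
import numpy as np

def hat(level, index, x):
    # 1D piecewise-linear hat basis function on level `level`, index `index`
    return np.maximum(1.0 - np.abs(2.0 ** level * x - index), 0.0)

def eval_sparse_grid(levels, indices, alphas, points):
    """Evaluate u(x) = sum_j alpha_j * prod_k hat(l_jk, i_jk, x_k).

    levels, indices: (M, d) arrays of grid-point levels/indices
    alphas:          (M,) hierarchical surpluses
    points:          (N, d) data points in [0, 1]^d
    Cost is O(N * M): linear in the number of data points N.
    """
    result = np.zeros(len(points))
    for l, i, a in zip(levels, indices, alphas):
        # d-dimensional basis value = product of 1D hat functions
        result += a * np.prod(hat(l, i, points), axis=1)
    return result
```

The loop over basis functions is independent per data point, which is exactly the data parallelism the paper exploits with SSE/AVX intrinsics, OpenCL, and Array Building Blocks.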