Efficient Neighbor Search for Particle Methods on GPUs

In this paper we present an efficient and general sorting-based approach for the neighbor search on GPUs. Finding the neighbors of a particle is a common task in particle methods and has a significant impact on the overall computational effort, especially in dynamics simulations. We extend a space-filling curve algorithm presented in Connor and Kumar (IEEE Trans Vis Comput Graph, 2009) for use on GPUs with the parallel computing model Compute Unified Device Architecture (CUDA). To evaluate our implementation, we consider the execution time of our GPU search algorithm for the most common assemblies of particles: a regular grid, uniformly distributed random points, and clustered points in 2 and 3 dimensions. The measured computational time is compared with the theoretical time complexity of the extended algorithm and with the computational time of its reference single-core implementation. The presented results show a speedup by a factor of 4 for the GPU implementation over the CPU reference.
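To illustrate the general idea behind a sorting-based search along a space-filling curve, the following is a minimal CPU sketch in Python, not the implementation evaluated in the paper. It assumes a Morton (Z-order) curve, 2D points with non-negative coordinates, and a uniform cell size; the names `interleave_bits_2d` and `morton_sort` are hypothetical and chosen only for this example.

```python
def interleave_bits_2d(x: int, y: int, bits: int = 16) -> int:
    """Morton (Z-order) key: bit i of x goes to bit 2i, bit i of y to bit 2i+1."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def morton_sort(points, cell_size: float):
    """Return point indices ordered by the Morton key of their grid cell.

    Assumption: all coordinates are non-negative, so the integer cell
    indices are non-negative as well.
    """
    keyed = []
    for idx, (px, py) in enumerate(points):
        cx, cy = int(px // cell_size), int(py // cell_size)
        keyed.append((interleave_bits_2d(cx, cy), idx))
    keyed.sort()  # on a GPU this step would be a parallel sort of the keys
    return [idx for _, idx in keyed]


if __name__ == "__main__":
    pts = [(1.5, 1.5), (0.5, 0.5), (0.5, 1.5), (1.5, 0.5)]
    print(morton_sort(pts, cell_size=1.0))
```

Points that are close in space tend to be close in the sorted order, so neighbor candidates can be drawn from a window around each point's position in the sorted sequence. The sort itself is the step that maps naturally onto a parallel GPU sort.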

[1] J. Monaghan, et al. Smoothed particle hydrodynamics: Theory and application to non-spherical stars, 1977.

[2] Dirk Pflüger, et al. Lecture Notes in Computational Science and Engineering, 2010.

[3] Piyush Kumar, et al. Fast construction of k-nearest neighbor graphs for point clouds, 2010, IEEE Transactions on Visualization and Computer Graphics.

[4] João Marcelo X. N. Teixeira, et al. Nearest Neighbor Searches on the GPU, 2011, International Journal of Parallel Programming.

[5] Sunil Arya, et al. An optimal algorithm for approximate nearest neighbor searching in fixed dimensions, 1998, JACM.

[6] Michael Griebel, et al. Numerical Simulation in Molecular Dynamics: Numerics, Algorithms, Parallelization, Applications, 2007.

[7] Hermann Tropf, et al. Multidimensional Range Search in Dynamically Balanced Trees, 1981, Angew. Inform.

[8] Sameer A. Nene, et al. A simple algorithm for nearest neighbor search in high dimensions, 1997.

[9] S. Silling. Reformulation of Elasticity Theory for Discontinuities and Long-Range Forces, 2000.

[10] Srinivas Aluru, et al. Parallel domain decomposition and load balancing using space-filling curves, 1997, Proceedings Fourth International Conference on High-Performance Computing.

[11] Christian Böhm, et al. Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases, 2001, CSUR.

[12] Marc Alexander Schweitzer, et al. A Parallel Multilevel Partition of Unity Method for Elliptic Partial Differential Equations, 2003, Lecture Notes in Computational Science and Engineering.

[13] Roman G. Strongin, et al. Introduction to Global Optimization Exploiting Space-Filling Curves, 2013.

[14] Ken Kennedy, et al. Improving Memory Hierarchy Performance for Irregular Applications Using Data and Computation Reorderings, 2001, International Journal of Parallel Programming.

[15] Michael Bader, et al. Space-Filling Curves - An Introduction with Applications in Scientific Computing, 2012, Texts in Computational Science and Engineering.

[16] Steven J. Plimpton, et al. Implementing peridynamics within a molecular dynamics code, 2007, Comput. Phys. Commun.

[17] M. S. Warren, et al. A parallel hashed Oct-Tree N-body algorithm, 1993, Supercomputing '93.

[18] S. Silling, et al. A meshfree method based on the peridynamic model of solid mechanics, 2005.

[19] Ulf Assarsson, et al. Fast parallel GPU-sorting using a hybrid algorithm, 2008, J. Parallel Distributed Comput.

[20] Ali Dashti, et al. Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters, 2013, PloS one.

[21] Steve Plimpton, et al. Fast parallel algorithms for short-range molecular dynamics, 1993.

[22] Michael Garland, et al. Designing efficient sorting algorithms for manycore GPUs, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.