Improved GPU near neighbours performance for multi-agent simulations

Abstract: Complex systems simulations are well suited to the SIMT paradigm of GPUs, enabling millions of actors to be processed in fractions of a second. At the core of many such simulations, fixed radius near neighbours (FRNN) search provides actors with spatial awareness of their neighbours. FRNN search is frequently the limiting factor of performance because its query stage demands a disproportionate number of scattered memory reads, leading to FRNN search runtimes that exceed those of the simulation logic. In this paper, we propose and evaluate two novel optimisations (Strips and Proportional Bin Width) for improving the performance of uniform spatially partitioned FRNN searches, and apply them in combination to demonstrate their impact on the performance of multi-agent simulations. The two approaches aim, respectively, to reduce latency in search and to reduce the amount of data considered (i.e. more efficient searching). When the two optimisations are combined, the peak speedups observed in a benchmark model are 1.27x and 1.34x in two- and three-dimensional implementations, respectively. Due to additional non-FRNN computation, the peak speedup obtained when applied to complex system simulations within FLAMEGPU is 1.21x.
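For readers unfamiliar with the baseline, a uniform spatially partitioned FRNN search can be sketched as follows. This is a minimal serial illustration in Python, not the paper's GPU implementation and not the Strips or Proportional Bin Width optimisations. The function names `build_grid` and `frnn_query`, the 2D setting, and the choice of bin width equal to the search radius are assumptions made for the example.

```python
from collections import defaultdict
from itertools import product
import math


def build_grid(points, radius):
    """Bin each 2D point index by its integer cell coordinates.

    Assumption for this sketch: the bin width equals the search radius,
    so all neighbours of a point lie in its own cell or the 8 adjacent cells.
    """
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cell = (math.floor(x / radius), math.floor(y / radius))
        grid[cell].append(idx)
    return grid


def frnn_query(points, grid, radius, query_idx):
    """Return the sorted indices of points within `radius` of points[query_idx]."""
    qx, qy = points[query_idx]
    cx, cy = math.floor(qx / radius), math.floor(qy / radius)
    radius_sq = radius * radius
    found = []
    # Visit the 3x3 block of cells around the query point's cell.
    for dx, dy in product((-1, 0, 1), repeat=2):
        for idx in grid.get((cx + dx, cy + dy), ()):
            if idx == query_idx:
                continue  # a point is not its own neighbour
            px, py = points[idx]
            if (px - qx) ** 2 + (py - qy) ** 2 <= radius_sq:
                found.append(idx)
    return sorted(found)


if __name__ == "__main__":
    pts = [(0.5, 0.5), (1.2, 0.6), (3.0, 3.0), (0.1, 1.4)]
    grid = build_grid(pts, 1.0)
    for i in range(len(pts)):
        print(i, frnn_query(pts, grid, 1.0, i))
```

Each query touches every point stored in the nine surrounding cells, including points that fail the distance test. On a GPU those reads are spread across memory, which is the query-stage cost the abstract identifies as the bottleneck.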
