A fast algorithm for identifying Friends-of-Friends halos

We describe a simple and fast algorithm for identifying friends-of-friends features and prove its correctness. The algorithm avoids unnecessary expensive neighbor queries, uses minimal memory overhead, and avoids slowdown in regions of high over-density. We define the algorithm formally in terms of pair enumeration, a problem that has been heavily studied for fast 2-point correlation codes; our reference implementation employs a dual KD-tree correlation function code. We construct features in a hierarchical tree structure and use a splay operation to reduce the average cost of identifying the root of a feature from $O[\log L]$ to $O[1]$, where $L$ is the size of a feature, without additional memory costs. This reduces the overall time complexity of merging trees from $O[L\log L]$ to $O[L]$, reducing the number of operations per splay by orders of magnitude. We next introduce a pruning operation that skips merge operations between two fully self-connected KD-tree nodes. This improves the robustness of the algorithm, reducing the number of merge operations in high-density peaks from $O[\delta^2]$ to $O[\delta]$. We show that for cosmological data sets the algorithm eliminates more than half of the merge operations for typically used linking lengths $b \sim 0.2$ (relative to the mean separation). Furthermore, the algorithm is extremely simple and easy to implement on top of an existing pair enumeration code, reusing the optimization effort that has been invested in fast correlation function codes.
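The core idea of merging features under pair enumeration can be illustrated with a minimal Python sketch. It is not the reference implementation. It makes three assumptions: brute-force pair enumeration stands in for the dual KD-tree code, ordinary path compression stands in for the splay operation, and the pruning step is omitted. The names `find_root` and `fof_labels` are hypothetical, and the linking length here is an absolute distance rather than a fraction of the mean separation.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def find_root(parent, i):
    """Return the root of particle i and flatten the path that was walked."""
    root = i
    while parent[root] != root:
        root = parent[root]
    # Path compression (assumed stand-in for the splay step): re-point
    # every visited node directly at the root so later lookups are short.
    while parent[i] != root:
        parent[i], i = root, parent[i]
    return root


def fof_labels(points, linking_length):
    """Label each point with the root index of its friends-of-friends feature."""
    parent = list(range(len(points)))  # each point starts as its own feature
    # Brute-force pair enumeration, O(N^2); a fast correlation function
    # code would enumerate only the pairs within the linking length.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= linking_length:
            root_i = find_root(parent, i)
            root_j = find_root(parent, j)
            if root_i != root_j:
                parent[root_j] = root_i  # merge the two trees
    return [find_root(parent, i) for i in range(len(points))]


points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0)]
labels = fof_labels(points, linking_length=0.15)
```

In this toy input the first three points form a chain: the two end points are farther apart than the linking length but share a common friend, so they receive the same label. The last two points form a separate feature.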
