nbodykit: A Python Toolkit for Cosmology Simulations and Data Analysis on Parallel HPC Systems

We present nbodykit, an open-source, massively parallel Python toolkit for cosmology simulations and data analysis, developed for high-performance computing machines. We discuss the challenges encountered in designing parallel, scalable software in Python that still exploits the unique interactive tools provided by the Python stack. Using the mpi4py library, nbodykit implements a fully parallel, canonical set of algorithms in the field of large-scale structure cosmology, and it includes a set of distributed data containers that are insulated from the algorithms themselves. We use the dask library to provide a straightforward method for users to manipulate data without worrying about the costs of parallel IO operations. We take advantage of the readability of Python as an interpreted language by implementing nbodykit in pure Python, while ensuring high performance by relying on external, compiled libraries optimized for specific tasks. We demonstrate the ease of use and performance capabilities of nbodykit with several real-world scenarios in the field of cosmology.
