CosmoFlow: Using Deep Learning to Learn the Universe at Scale

Deep learning is a promising tool for determining the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel® Xeon Phi™ processors. We also use the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully synchronous training. These enhancements enable us to process large 3D dark matter distributions and predict the cosmological parameters Ω_M, σ_8 and n_s with unprecedented accuracy.
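To make the terms "fully synchronous data-parallel training" and "parallel efficiency" concrete, the following is a minimal pure-Python sketch, not CosmoFlow code. It assumes a toy one-parameter model, simulates workers as list shards, and stands in a plain average for the allreduce that a communication layer would perform. All function and variable names are hypothetical, and the efficiency formula assumes the baseline is ideal linear scaling from a single node, which the abstract does not spell out.

```python
# Illustrative sketch only; all names here are hypothetical, not CosmoFlow APIs.

def parallel_efficiency(throughput_n, throughput_1, nodes):
    """Measured throughput on `nodes` nodes divided by ideal linear scaling.
    Assumption: the baseline is single-node throughput times the node count."""
    return throughput_n / (nodes * throughput_1)

def local_gradient(w, shard):
    """Gradient of the mean squared error for the 1-D model y = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def synchronous_step(w, shards, lr):
    """One fully synchronous step: every simulated worker computes a gradient on
    its own shard, the gradients are averaged (the role an allreduce plays), and
    all workers then apply the same update to the same weights."""
    grads = [local_gradient(w, shard) for shard in shards]
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad

# Toy data for y = 3x, split evenly across 4 simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]

w = 0.0
for _ in range(200):
    w = synchronous_step(w, shards, lr=0.01)

# Arithmetic on the figures quoted in the abstract:
# 8192 nodes, 77% parallel efficiency, 3.5 Pflop/s sustained.
effective_speedup = 0.77 * 8192      # equivalent number of perfectly scaling nodes
per_node_flops = 3.5e15 / 8192       # average sustained flop/s per node (derived)
```

Because the shards are equal in size, the averaged gradient equals the full-batch gradient, so every worker holds identical weights after each step; that property is what distinguishes synchronous from asynchronous updates. Under the stated baseline assumption, 77% efficiency on 8192 nodes corresponds to an effective speedup of about 6308, and 3.5 Pflop/s averages to roughly 427 Gflop/s per node. Both are derived figures, not numbers reported in the abstract.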
