Evaluating and Optimizing the NERSC Workload on Knights Landing

NERSC has partnered with 20 representative application teams to evaluate performance on the Xeon Phi Knights Landing architecture and to develop an application-optimization strategy for the broader NERSC workload on the recently installed Cori system. In this article, we present early case studies and summarized results from a subset of the 20 applications, highlighting the impact of important architectural differences between the Xeon Phi and traditional Xeon processors. We summarize the status of the applications and describe the broader optimization strategy that has emerged.
