Implementation of High-Order Multireference Coupled-Cluster Methods on Intel Many Integrated Core Architecture

In this paper we discuss an implementation of the multireference coupled-cluster formalism with singles, doubles, and noniterative triples (MRCCSD(T)) that takes advantage of the processing power of the Intel Xeon Phi coprocessor. We describe how the two levels of parallelism underlying the MRCCSD(T) implementation are integrated with computational kernels that offload the computationally intensive parts of the formalism to Intel Xeon Phi coprocessors. Special attention is given to enhancing parallel performance through task reordering, which improves load balancing in the noniterative part of the MRCCSD(T) calculations. We also discuss efficient optimization and vectorization strategies.
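As an illustration of why task reordering can improve load balancing, the following minimal sketch compares greedy scheduling of tasks in their original order with scheduling after sorting by estimated cost, largest first. This is a generic heuristic and not necessarily the reordering scheme used in the MRCCSD(T) implementation; the function names, the cost values, and the two-worker setup are all hypothetical.

```python
import heapq


def makespan(costs, n_workers):
    """Greedy list scheduling: each task, in the given order, goes to the
    currently least-loaded worker. Returns the largest per-worker load."""
    # Heap of (accumulated load, worker id); ties are broken by worker id.
    loads = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(loads)
    for cost in costs:
        load, worker = heapq.heappop(loads)
        heapq.heappush(loads, (load + cost, worker))
    return max(load for load, _ in loads)


def makespan_reordered(costs, n_workers):
    """Same scheduler, but the tasks are first sorted by decreasing cost
    (hypothetical cost estimates), so expensive tasks are placed early."""
    return makespan(sorted(costs, reverse=True), n_workers)


# Hypothetical task costs: three cheap tasks followed by one expensive task.
task_costs = [2, 2, 2, 6]

original = makespan(task_costs, n_workers=2)
reordered = makespan_reordered(task_costs, n_workers=2)
print(original, reordered)
```

In this toy case the expensive task arrives last in the original order and lands on an already busy worker, whereas placing it first lets the cheap tasks fill in around it and yields a shorter overall run time.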
