Toward high‐performance computational chemistry: II. A scalable self‐consistent field program

We discuss issues in developing scalable parallel algorithms and focus on the distribution, as opposed to the replication, of key data structures. Replication of large data structures limits the maximum calculation size by imposing a low ratio of processors to memory. Only applications that distribute both data and computation across processors are truly scalable. The use of shared data structures that may be accessed independently by each process, even in a distributed-memory environment, greatly simplifies development and provides a significant performance enhancement. We describe tools we have developed to support this programming paradigm. These tools are used to develop a highly efficient and scalable algorithm to perform self-consistent field calculations on molecular systems. A simple and classical strip-mining algorithm suffices to achieve an efficient and scalable Fock matrix construction in which all matrices are fully distributed. By strip-mining over atoms, we also exploit all available sparsity and pave the way to adopting more sophisticated methods for summation of the Coulomb and exchange interactions. © 1996 by John Wiley & Sons, Inc.
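
To make the access pattern implied by strip-mining over atom blocks with fully distributed matrices concrete, here is a minimal Python sketch. The DistArray class, the atom_blocks partitioning, and the eri_block integral routine are hypothetical stand-ins for the distributed-array tools and integral code the abstract refers to, not the authors' implementation; on a real distributed-memory machine, get and acc would move patches of the density and Fock matrices to and from remote processes.

import numpy as np

class DistArray:
    # Toy stand-in for a globally addressable, distributed 2-D array.
    # Here the data are simply local; on a parallel machine get/acc would
    # transfer patches to and from the memory of remote processes.
    def __init__(self, n):
        self.a = np.zeros((n, n))
    def get(self, rows, cols):
        return self.a[np.ix_(rows, cols)]
    def acc(self, rows, cols, patch):
        self.a[np.ix_(rows, cols)] += patch

def eri_block(bA, bB, bC, bD, eri):
    # Hypothetical integral routine: returns the (AB|CD) block of
    # two-electron integrals, here sliced from a precomputed 4-index array.
    return eri[np.ix_(bA, bB, bC, bD)]

def fock_build(hcore, D, eri, atom_blocks):
    # Closed-shell Fock build, F = h + J - K/2, strip-mined over quartets
    # of atom blocks; D is a DistArray holding the total density, and only
    # small patches of D and F are touched for each quartet.
    n = hcore.shape[0]
    F = DistArray(n)
    F.acc(range(n), range(n), hcore)
    for bA in atom_blocks:
        for bB in atom_blocks:
            for bC in atom_blocks:
                for bD in atom_blocks:
                    g = eri_block(bA, bB, bC, bD, eri)   # (AB|CD)
                    # Coulomb: J_AB += sum_CD (AB|CD) D_CD
                    F.acc(bA, bB, np.einsum('abcd,cd->ab', g, D.get(bC, bD)))
                    # Exchange: F_AC -= 1/2 sum_BD (AB|CD) D_BD
                    F.acc(bA, bC, -0.5 * np.einsum('abcd,bd->ac', g, D.get(bB, bD)))
    return F.a

Because each quartet of atom blocks is handled independently, quartets can be assigned dynamically to processes for load balance, and quartets whose contribution is negligible (for example, by a Schwarz-type bound on the integral block) can simply be skipped, which is how strip-mining over atoms exposes the available sparsity.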
