Optimizing the Four-Index Integral Transform Using Data Movement Lower Bounds Analysis
Sriram Krishnamoorthy | P. Sadayappan | Fabrice Rastello | Samyam Rajbhandari | Karol Kowalski
[1] Sriram Krishnamoorthy,et al. Practical Loop Transformations for Tensor Contraction Expressions on Multi-level Memory Hierarchies , 2011, CC.
[2] Guntram Rauhut,et al. Integral transformation with low‐order scaling for large local second‐order Møller–Plesset calculations , 1998 .
[3] Mark S. Gordon,et al. General atomic and molecular electronic structure system , 1993, J. Comput. Chem..
[4] Lawrence A. Covick,et al. Four‐Index transformation on distributed‐memory parallel computers , 1990 .
[5] Mark S. Gordon,et al. Parallel algorithm for integral transformations and GUGA MCSCF , 1994 .
[6] Mark S. Gordon,et al. DEVELOPMENTS IN PARALLEL ELECTRONIC STRUCTURE THEORY , 2007 .
[7] M. Pernpointner,et al. Parallelization of four‐component calculations. I. Integral generation, SCF, and four‐index transformation in the Dirac–Fock package MOLFDIR , 2000, J. Comput. Chem..
[8] Sriram Krishnamoorthy,et al. Integrated Loop Optimizations for Data Locality Enhancement of Tensor Contraction Expressions , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[9] Gianfranco Bilardi,et al. A Characterization of Temporal Locality and Its Portability across Memory Hierarchies , 2001, ICALP.
[10] Lucas Visscher,et al. Parallelization of four-component calculations. I. Integral generation, SCF, and four-index transformation in the Dirac-Fock package MOLFDIR , 2000, J. Comput. Chem..
[11] Matthew L. Leininger,et al. Psi4: an open‐source ab initio electronic structure program , 2012 .
[12] Robert J. Harrison,et al. Parallel direct four-index transformations , 1996 .
[13] Svein Saebo,et al. Avoiding the integral storage bottleneck in LCAO calculations of electron correlation , 1989 .
[14] Guntram Rauhut,et al. Integral transformation with low-order scaling for large local second-order Møller-Plesset calculations , 1998, J. Comput. Chem..
[15] Jarek Nieplocha,et al. Advances, Applications and Performance of the Global Arrays Shared Memory Programming Toolkit , 2006, Int. J. High Perform. Comput. Appl..
[16] S. Wilson. Four-Index Transformations , 1987 .
[18] K. Hirao,et al. A four-index transformation in Dirac's four-component relativistic theory , 2004 .
[19] Thomas Rauber,et al. Memory-optimal evaluation of expression trees involving large objects , 1999, Comput. Lang. Syst. Struct..
[20] Henry F. Schaefer,et al. Parallel algorithms for quantum chemistry. I. Integral transformations on a hypercube multiprocessor , 1987 .
[21] Shridhar R. Gadre,et al. A general parallel solution to the integral transformation and second‐order Møller–Plesset energy evaluation on distributed memory parallel machines , 1994 .
[22] Yves Robert,et al. Matrix product on heterogeneous master-worker platforms , 2008, PPoPP.
[23] Thomas R. Furlani,et al. Implementation of a parallel direct SCF algorithm on distributed memory computers , 1995, J. Comput. Chem..
[24] Dror Irony,et al. Communication lower bounds for distributed-memory matrix multiplication , 2004, J. Parallel Distributed Comput..
[25] H. T. Kung,et al. I/O complexity: The red-blue pebble game , 1981, STOC '81.
[26] Sriram Krishnamoorthy,et al. Efficient Search-Space Pruning for Integrated Fusion and Tiling Transformations , 2005, LCPC.
[27] Tjerk P. Straatsma,et al. NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations , 2010, Comput. Phys. Commun..