Toward generalized tensor algebra for ab initio quantum chemistry methods

The widespread use of tensor operations in describing electronic structure calculations has motivated the design of software frameworks for the productive development of scalable, optimized tensor-based electronic structure methods. Whereas prior work focused on Cartesian abstractions for dense tensors, we present an algebra to specify and perform tensor operations on a larger class of block-sparse tensors. We illustrate the use of this framework in expressing real-world computational chemistry calculations that are beyond the reach of existing frameworks.
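
To make the notion of a block-sparse tensor operation concrete, the following is a minimal sketch of a block-sparse matrix contraction. It is not the framework's interface or algorithm: the dictionary-of-blocks storage and the names `matmul`, `add_into`, and `block_sparse_contract` are assumptions introduced purely for illustration. Each operand stores only its nonzero blocks, keyed by block indices, and the contraction C(i,j) = sum over k of A(i,k) B(k,j) visits only pairs of stored blocks that share the contracted block index k.

```python
# Hypothetical illustration: block-sparse matrices stored as
# {(row_block, col_block): dense block as a list of rows}.
# Blocks that are absent from the dictionary are treated as zero.

def matmul(a, b):
    """Dense product of two small blocks given as lists of rows."""
    return [
        [sum(a[r][t] * b[t][c] for t in range(len(b))) for c in range(len(b[0]))]
        for r in range(len(a))
    ]

def add_into(acc, m):
    """Accumulate block m into block acc in place."""
    for r, row in enumerate(m):
        for c, value in enumerate(row):
            acc[r][c] += value

def block_sparse_contract(a_blocks, b_blocks):
    """Contract over the shared block index, skipping zero (unstored) blocks."""
    c_blocks = {}
    for (i, k), a in a_blocks.items():
        for (k2, j), b in b_blocks.items():
            if k != k2:
                continue  # blocks do not share the contracted index
            prod = matmul(a, b)
            if (i, j) in c_blocks:
                add_into(c_blocks[(i, j)], prod)
            else:
                c_blocks[(i, j)] = prod
    return c_blocks

# Example with two index tiles of sizes 2 and 1 (made-up values).
a_blocks = {(0, 0): [[1, 2], [3, 4]], (1, 1): [[2]]}
b_blocks = {(0, 1): [[1], [1]], (1, 1): [[5]]}
c_blocks = block_sparse_contract(a_blocks, b_blocks)
print(c_blocks)
```

Only the result blocks (0, 1) and (1, 1) are ever formed in this example; blocks whose contributions would all involve zero blocks are never computed or stored, which is the saving that block sparsity provides over a dense representation.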
