Enhancing Scalability of a Matrix-Free Eigensolver for Studying Many-Body Localization

In [Van Beeumen et al., HPC Asia 2020, https://www.doi.org/10.1145/3368474.3368497], a scalable, matrix-free eigensolver was proposed for studying the many-body localization (MBL) transition of two-level quantum spin chain models with nearest-neighbor $XX+YY$ interactions plus $Z$ terms. This type of problem is computationally challenging because the dimension of the vector space grows exponentially with the physical system size, and because averaging over many configurations of the random disorder is needed to obtain relevant statistical behavior. For each eigenvalue problem, eigenvalues from different regions of the spectrum and their corresponding eigenvectors need to be computed. Traditionally, the interior eigenstates of a single eigenvalue problem are computed via the shift-and-invert Lanczos algorithm. Because of the extremely high memory footprint of the LU factorizations, this technique is not well suited for large numbers of spins $L$; for example, thousands of compute nodes on modern high-performance computing infrastructures are needed to go beyond $L = 24$. The matrix-free approach does not suffer from this memory bottleneck; however, its scalability is limited by an imbalance between computation and communication. We present a few strategies to reduce this imbalance and to significantly enhance the scalability of the matrix-free eigensolver. To optimize communication performance, we leverage the consistent space runtime, CSPACER, and show that it accelerates the irregular MBL communication patterns at scale more efficiently than optimized MPI non-blocking two-sided and one-sided RMA implementation variants. The efficiency and effectiveness of the proposed algorithm are demonstrated by computing eigenstates on a massively parallel many-core high-performance computer.
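As a rough illustration of what "matrix-free" means here, the following single-process Python sketch applies a spin-chain Hamiltonian to a state vector without ever storing the $2^L \times 2^L$ matrix. It is not the authors' implementation. It assumes an open chain, Pauli (not spin-1/2) operators, and that the $Z$ terms are purely on-site random fields $h_i Z_i$; the function name `apply_hamiltonian` and the bit-encoding convention (bit $i$ of the basis index is the state of site $i$) are hypothetical choices made for this example.

```python
import numpy as np

def apply_hamiltonian(x, h):
    """Return y = H x for H = sum_i (X_i X_{i+1} + Y_i Y_{i+1}) + sum_i h_i Z_i.

    Assumptions (not taken from the paper): open boundary conditions,
    Pauli matrices, on-site Z fields only. Bit i of a basis index encodes
    site i, with bit value 0 meaning Z = +1 and bit value 1 meaning Z = -1.
    """
    L = len(h)
    states = np.arange(1 << L)

    # Diagonal part: each site contributes +h_i or -h_i depending on its bit.
    diag = np.zeros(1 << L)
    for i in range(L):
        diag += h[i] * (1 - 2 * ((states >> i) & 1))
    y = diag * x

    # Off-diagonal part: XX + YY on a bond swaps |01> and |10> with weight 2
    # and annihilates |00> and |11>.
    for i in range(L - 1):
        mask = (1 << i) | (1 << (i + 1))
        differ = ((states >> i) & 1) != ((states >> (i + 1)) & 1)
        src = states[differ]
        # src ^ mask is a permutation of src, so no index is hit twice.
        y[src ^ mask] += 2.0 * x[src]
    return y

# Example: one random disorder realization on a small chain.
rng = np.random.default_rng(0)
h = rng.uniform(-1.0, 1.0, size=10)
x = rng.standard_normal(1 << 10)
y = apply_hamiltonian(x, h)
```

Only vectors of length $2^L$ are held in memory, which is the property that lets the matrix-free approach avoid the LU memory bottleneck. In a distributed setting the bit-flip updates are what generate the irregular communication pattern, since `src ^ mask` can refer to entries owned by another process.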
