Memory-Access Optimization of Parallel Molecular Dynamics Simulation via Dynamic Data Reordering

Dynamic irregular applications such as molecular dynamics (MD) simulation often suffer considerable performance deterioration during execution. To address this problem, an optimal data-reordering schedule has been developed for runtime memory-access optimization of MD simulations on parallel computers. Analysis of the memory-access penalty during MD simulations shows that the performance improvement from computation and data reordering degrades gradually as data translation lookaside buffer misses increase. We have also found correlations between the performance degradation and physical properties such as the simulated temperature, as well as computational parameters such as the spatial-decomposition granularity. Based on a performance model and pre-profiling of data fragmentation behaviors, we have developed an optimal runtime data-reordering schedule, thereby achieving speedups of 1.35, 1.36 and 1.28, respectively, for MD simulations of silica at temperatures of 300 K, 3,000 K and 6,000 K.
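As an illustration of what data reordering means here, the following minimal Python sketch sorts atoms by the index of the spatial cell that contains them, so that atoms close in space end up close in memory. The abstract does not specify the ordering used, so the row-major cell ordering and the names `cell_index`, `reorder_permutation` and `apply_permutation` are assumptions made for this example, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Dims = Tuple[int, int, int]


def cell_index(pos: Vec3, box: Vec3, cells_per_dim: Dims) -> int:
    """Map a position to a scalar cell index (row-major order, an assumed choice)."""
    idx = []
    for x, length, n in zip(pos, box, cells_per_dim):
        # Wrap into the periodic box, then scale to a cell coordinate.
        c = int((x % length) / length * n)
        idx.append(min(c, n - 1))  # guard against rounding up to n
    cx, cy, cz = idx
    _, ny, nz = cells_per_dim
    return (cx * ny + cy) * nz + cz


def reorder_permutation(positions: Sequence[Vec3], box: Vec3,
                        cells_per_dim: Dims) -> List[int]:
    """Return atom indices sorted by cell index (stable within a cell)."""
    keys = [cell_index(p, box, cells_per_dim) for p in positions]
    return sorted(range(len(positions)), key=keys.__getitem__)


def apply_permutation(perm: Sequence[int], *arrays: Sequence) -> tuple:
    """Apply the same permutation to every per-atom array."""
    return tuple([a[i] for i in perm] for a in arrays)


if __name__ == "__main__":
    positions = [(9.0, 9.0, 9.0), (0.5, 0.5, 0.5),
                 (9.5, 9.5, 9.5), (0.1, 0.2, 0.3)]
    velocities = [(0.0, 0.0, 0.0)] * 4
    perm = reorder_permutation(positions, (10.0, 10.0, 10.0), (2, 2, 2))
    positions, velocities = apply_permutation(perm, positions, velocities)
    print(perm)
```

Because atoms diffuse during the simulation, such an ordering gradually fragments and must be reapplied from time to time. How often to reapply it is the question the runtime schedule addresses, and this sketch does not model it.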
