Novo‐G#: a multidimensional torus‐based reconfigurable cluster for molecular dynamics

Molecular dynamics (MD) is a large‐scale, communication‐intensive problem that has been the subject of high‐performance computing research and acceleration for years. Not surprisingly, the greatest success in accelerating MD has come from specialized systems such as the Anton machine. In this paper, we describe Novo‐G# (novo‐jee‐sharp), a multi‐node reconfigurable system designed to accelerate communication‐intensive scientific problems in general, and MD in particular. The system provides a high‐bandwidth, low‐latency 3D torus network that allows direct communication between kernels running on multiple field‐programmable gate arrays (FPGAs). We also present a performance model for Novo‐G# running the 3D Fast Fourier Transform (FFT) kernel that forms the core of MD simulations. We validate the model against published Anton performance data and through initial hardware experiments on Novo‐G#. Finally, through simulation studies, we show that, for the 3D FFT kernel, this system at scale performs better than specialized systems like Anton and outperforms established CPU‐based clusters like Blue Gene/Q by an order of magnitude, with greater flexibility and lower cost. Copyright © 2015 John Wiley & Sons, Ltd.
