GPU Cluster for High Performance Computing

Inspired by the attractive FLOPS-per-dollar ratio and the rapid growth in the speed of modern graphics processing units (GPUs), we propose using a cluster of GPUs for high performance scientific computing. As an example application, we have developed a parallel flow simulation using the lattice Boltzmann model (LBM) on a GPU cluster and have simulated the dispersion of airborne contaminants in the Times Square area of New York City. Using 30 GPU nodes, our simulation computes a 480x400x80 LBM in 0.31 seconds per step, which is 4.6 times faster than our CPU cluster implementation. Besides the LBM, we also discuss other potential applications of the GPU cluster, such as cellular automata, partial differential equation (PDE) solvers, and the finite element method (FEM).
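
To put the reported figures in perspective, the short Python sketch below derives a few quantities from them: the total number of lattice cells, the aggregate cell updates per second, the cells per node, and the CPU cluster step time implied by the speedup. The function name `lbm_throughput` and its parameters are hypothetical and introduced only for illustration. The sketch assumes that "4.6 times faster" means a ratio of per-step times and that the lattice is divided evenly across the 30 nodes; the abstract states neither.

```python
def lbm_throughput(nx, ny, nz, seconds_per_step, nodes, speedup):
    """Derive rough throughput figures from a lattice size and a step time.

    Assumptions (not stated in the abstract): the speedup is a ratio of
    per-step times, and cells are split evenly across the nodes.
    """
    cells = nx * ny * nz  # total lattice cells
    return {
        "cells": cells,
        # aggregate cell updates per second across the whole cluster
        "cells_per_second": cells / seconds_per_step,
        # cells handled by each node under an even partition
        "cells_per_node": cells / nodes,
        # CPU cluster step time implied by the reported speedup
        "implied_cpu_seconds_per_step": seconds_per_step * speedup,
    }


# Figures taken from the abstract: 480x400x80 lattice, 0.31 s/step,
# 30 GPU nodes, 4.6x speedup over the CPU cluster implementation.
stats = lbm_throughput(480, 400, 80, 0.31, 30, 4.6)
for name, value in stats.items():
    print(f"{name}: {value:,.3f}")
```

Under these assumptions the lattice holds about 15.4 million cells, so the cluster performs roughly 50 million cell updates per second, and the CPU cluster would need about 1.4 seconds per step.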
