Parallel Lattice Boltzmann Flow Simulation on a Low-cost PlayStation3 Cluster

A scalable parallel algorithm has been designed to perform large-scale flow simulations based on the lattice Boltzmann method. The algorithm combines hierarchical spatial decomposition with a critical section-free, dual representation to expose maximal concurrency and data locality, thereby achieving an isogranular parallel efficiency of 0.977 on 65,536 IBM BlueGene/L nodes. A hybrid thread + message-passing programming model is employed to implement the algorithm on a low-cost (approximately 5,000 US dollars) Linux cluster consisting of nine PlayStation3 consoles (based on the Cell Broadband Engine architecture) connected via a Gigabit Ethernet switch. Within each PlayStation3 console, the program achieves a high multithreading parallel efficiency of 0.882 on 6 Synergistic Processing Elements (SPEs) and a 13.2-fold performance improvement over a conventional PowerPC processor. Despite the limited bandwidth of the low-cost Ethernet switch, the program achieves a reasonable inter-console parallel efficiency of 0.704.

* Corresponding Author. Email: anakano@usc.edu.

International Journal of Computational Science, 2008, Vol. 2, No. 4, 437-449. 1992-6669 (Print), 1992-6677 (Online). www.gip.hk/ijcs. © 2008 Global Information Publisher (H.K) Co., Ltd.
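The efficiency figures quoted in the abstract can be related to speedups with a short calculation. The sketch below is illustrative only: it assumes the conventional definitions (fixed-problem-size efficiency E = S / P, and isogranular efficiency as aggregate speed on P workers divided by P times the single-worker speed, with the work per worker held constant), which the abstract does not spell out. It also assumes that the 0.882 multithreading efficiency is measured relative to a single SPE. The function names and the sample speeds in the second call are hypothetical.

```python
def implied_speedup(efficiency: float, workers: int) -> float:
    """Speedup implied by a parallel efficiency E = S / P (assumed definition)."""
    return efficiency * workers


def isogranular_efficiency(speed_p: float, speed_1: float, workers: int) -> float:
    """Weak-scaling efficiency with the work per worker held constant
    (assumed definition): aggregate speed on P workers / (P * speed on 1)."""
    return speed_p / (workers * speed_1)


# Figure from the abstract: efficiency 0.882 on 6 SPEs.
# Assumption: the baseline is one SPE, so the implied speedup is about 5.29.
spe_speedup = implied_speedup(0.882, 6)
print(f"Implied speedup on 6 SPEs: {spe_speedup:.3f}")

# Hypothetical speeds (not from the paper) chosen to show how a value
# such as 0.977 would arise from measured throughput.
example = isogranular_efficiency(speed_p=97.7, speed_1=1.0, workers=100)
print(f"Example isogranular efficiency: {example:.3f}")
```

Under these assumptions, six SPEs at an efficiency of 0.882 behave like roughly 5.3 ideal SPEs.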
