Lattice QCD Applications on QPACE

Abstract: QPACE is a novel massively parallel architecture optimized for lattice QCD simulations. Each QPACE node is based on the IBM PowerXCell 8i processor, and the nodes are interconnected by a custom three-dimensional torus network implemented on an FPGA. The compute power of the processor is provided by its 8 Synergistic Processing Units. Making efficient use of these accelerator cores in scientific applications is challenging. In this paper we describe our strategies for porting applications to the QPACE architecture and report performance numbers.
