Building high-resolution sky images using the Cell/B.E.

The performance potential of the Cell/B.E., as well as its availability, has attracted a lot of attention from various high-performance computing (HPC) fields. While computation-intensive kernels have proved to be exceptionally well suited to the Cell, irregular data-intensive applications are usually considered poor matches. In this paper, we present our complete solution for enabling such a data-intensive application to run efficiently on the Cell/B.E. processor. Specifically, we target radioastronomy data gridding and degridding, two similar imaging filters based on convolutional resampling. Our solution starts from a high-level application model, which we use to evaluate parallelization alternatives. We then choose the alternative with the best performance potential and gradually exploit this potential by applying platform-specific and application-specific optimizations. After several iterations, our target application shows a speed-up factor between 10 and 20 on a dual-Cell blade when compared with the original application running on a commodity machine. Given these results, and based on our empirical observations, we are able to pinpoint a set of ten guidelines for parallelizing similar applications on the Cell/B.E. Finally, we conclude that the Cell/B.E. can provide high performance for data-intensive applications, at the price of increased programming effort and with significant aid from aggressive application-specific optimizations.
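
To make "convolutional resampling" concrete, the following is a minimal sketch of the two filters, not the implementation evaluated in the paper. It assumes integer sample coordinates, a small square kernel with odd side length, and a single shared kernel; the function names `grid` and `degrid` and all parameters are hypothetical. It also illustrates why the workload is irregular: which grid cells are touched depends on each sample's coordinates.

```python
# Hypothetical sketch of convolutional resampling; all names and sizes are
# illustrative assumptions, not taken from the paper's code.

def grid(samples, kernel, size):
    """Gridding: spread each sample onto a regular size x size grid.

    samples: list of (u, v, value), where u and v are the integer grid
             coordinates of the kernel centre and value is complex.
    kernel:  square list of lists of weights with odd side length.
    """
    half = len(kernel) // 2
    image = [[0j] * size for _ in range(size)]
    for u, v, value in samples:
        for dv in range(-half, half + 1):
            for du in range(-half, half + 1):
                x, y = u + du, v + dv
                # Skip kernel taps that fall outside the grid.
                if 0 <= x < size and 0 <= y < size:
                    image[y][x] += kernel[dv + half][du + half] * value
    return image


def degrid(image, positions, kernel):
    """Degridding: read one kernel-weighted sum of grid cells per position."""
    half = len(kernel) // 2
    size = len(image)
    values = []
    for u, v in positions:
        acc = 0j
        for dv in range(-half, half + 1):
            for du in range(-half, half + 1):
                x, y = u + du, v + dv
                if 0 <= x < size and 0 <= y < size:
                    acc += kernel[dv + half][du + half] * image[y][x]
        values.append(acc)
    return values
```

Both loops do only one multiply-add per memory access, so under these assumptions the memory traffic, rather than the arithmetic, dominates the cost.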

[23]  Ana Lucia Varbanescu,et al.  Building high-resolution sky images using the Cell/B.E. , 2009, HiPC 2009.
