Complex version of high performance computing LINPACK benchmark (HPL)

This paper describes our effort to improve the performance of the AORSA fusion energy simulation program by using the High-Performance LINPACK (HPL) benchmark, which is commonly used to rank the top 500 supercomputers. The algorithm used by HPL, enhanced by a set of tuning options, is more effective than the one found in the ScaLAPACK library. Retrofitting HPL's techniques, such as look-ahead processing of pivot elements, into ScaLAPACK is considered a major undertaking. Moreover, HPL is configured as a benchmark and supports only real-valued coefficients. We therefore developed software to convert HPL for use within an application program that generates linear systems with complex coefficients. Although HPL is not normally perceived as part of an application, our results show that the modified HPL software significantly increases solver performance when simulating the highest-resolution experiments configured thus far, achieving 87.5 TFLOPS on over 20 000 processors on the Cray XT4. Copyright © 2009 John Wiley & Sons, Ltd.
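
For readers unfamiliar with the kind of problem involved, the following is a minimal serial sketch of solving a dense complex-coefficient linear system by Gaussian elimination with partial pivoting. It is an illustration only and is not the paper's code: the function name `solve_complex_dense` is hypothetical, and the sketch omits everything that makes HPL fast (block-cyclic data distribution, look-ahead processing, tuned BLAS kernels).

```python
def solve_complex_dense(a, b):
    """Solve a @ x = b for a dense complex system (hypothetical helper).

    Uses Gaussian elimination with partial pivoting, followed by
    back substitution. Inputs are nested lists; they are not modified.
    """
    n = len(a)
    # Work on complex copies so the caller's data is untouched.
    a = [[complex(v) for v in row] for row in a]
    x = [complex(v) for v in b]

    for k in range(n):
        # Partial pivoting: pick the row whose entry in column k
        # has the largest modulus.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if abs(a[p][k]) == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            a[k], a[p] = a[p], a[k]
            x[k], x[p] = x[p], x[k]

        # Eliminate entries below the pivot and apply the same
        # row operation to the right-hand side.
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= m * a[k][j]
            a[i][k] = 0j
            x[i] -= m * x[k]

    # Back substitution on the resulting upper-triangular system.
    for i in range(n - 1, -1, -1):
        s = x[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


if __name__ == "__main__":
    # Small example with complex coefficients; values are arbitrary.
    a = [[2 + 1j, 1], [1, 3 - 1j]]
    b = [3 + 2j, 6 - 4j]
    print(solve_complex_dense(a, b))
```

The arithmetic is the same as in the real-valued case; only the element type and the pivot comparison (modulus instead of absolute value) change, which is part of why a real-valued code can be converted to handle complex systems.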
