Paper information - Latency, bandwidth, and concurrent issue limitations in high-performance CFD.

To achieve high performance, a parallel algorithm needs to effectively utilize the memory subsystem and minimize the communication volume and the number of network transactions. These issues gain further importance on modern architectures, where the peak CPU performance is increasing much more rapidly than the memory or network performance. In this paper, we present some performance-enhancing techniques that were employed on an unstructured mesh implicit solver. Our experimental results show that this solver adapts reasonably well to the high memory and network latencies.
