Porting and Optimizing OpenFOAM on the Sunway TaihuLight System

The Sunway TaihuLight supercomputer, built on Chinese-designed many-core processors, is the world's fastest system, with a peak performance of 125.4 PFlops. OpenFOAM (Open Source Field Operation and Manipulation) is one of the most popular open-source computational fluid dynamics (CFD) packages; it is written in C++ and is not fully compatible with the compilers on the heterogeneous many-core processor SW26010. In this paper, we port OpenFOAM to the SW26010 based on its MPE (management processing element)/CPE (computing processing element) cluster architecture. To overcome the compilation incompatibility, we adopt a mixed-language application design. We also apply several SW26010-specific optimizations to the hotspot of OpenFOAM to deliver high performance, including register communication, vectorization, and double buffering. Experiments on SW26010 using real datasets show that the single-CG (core group) code runs 8.03x faster than the well-tuned version on the MPE, and that the performance of a single CG is 1.18x higher than the serial implementation on an Intel(R) Xeon(R) CPU E5-2695 v3. We also optimize the parallel implementation of OpenFOAM and obtain a speedup of 184.9x on 256 CGs. The porting methods and optimizations presented here can also serve as a reference for other complex C++ programs that aim for high performance on SW26010.
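As a rough illustration of two ideas named above, the following minimal sketch shows (a) a mixed-language boundary, where C++ code calls a kernel exposed with C linkage, and (b) double buffering inside that kernel. It is not the authors' implementation and uses no Sunway-specific API: the names `axpy_tiled_kernel`, `axpy`, and `kTile` are hypothetical, the AXPY update merely stands in for a real OpenFOAM hotspot, and `std::memcpy` stands in for the asynchronous transfer into fast local memory that real hardware would overlap with computation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical tile size; a real port would size this to the local memory.
constexpr std::size_t kTile = 64;

// C-linkage boundary (assumption for illustration): in a mixed-language
// design, a kernel like this could live in a separate C source file built by
// a different compiler, with only this plain-C signature shared with C++.
extern "C" void axpy_tiled_kernel(double a, const double* x, double* y,
                                  std::size_t n) {
    if (n == 0) return;

    // Two staging buffers: one is computed on while the other is filled.
    double buf[2][kTile];
    int cur = 0;

    // Stage the first tile before entering the loop.
    std::memcpy(buf[cur], x, std::min(kTile, n) * sizeof(double));

    for (std::size_t off = 0; off < n; off += kTile) {
        const std::size_t len = std::min(kTile, n - off);

        // Prefetch the next tile into the other buffer. Here this is a
        // synchronous memcpy; on real hardware it would be an asynchronous
        // copy that overlaps with the compute loop below.
        const std::size_t next_off = off + kTile;
        if (next_off < n) {
            const std::size_t next_len = std::min(kTile, n - next_off);
            std::memcpy(buf[1 - cur], x + next_off, next_len * sizeof(double));
        }

        // Compute on the tile that is already staged: y += a * x.
        for (std::size_t i = 0; i < len; ++i) {
            y[off + i] += a * buf[cur][i];
        }

        cur = 1 - cur;  // Swap the roles of the two buffers.
    }
}

// C++ side: keeps its containers and passes only raw pointers and sizes
// across the C boundary.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    axpy_tiled_kernel(a, x.data(), y.data(), x.size());
}
```

Calling `axpy(2.0, x, y)` on two equally sized vectors updates `y` in place. The point of the split is that the C++ layer never needs to be compiled by the accelerator-side toolchain, while the kernel never sees C++ types.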