Observation and analysis of the multicore performance impact on scientific applications

As large multicore high‐performance computing systems proliferate, application performance is often negatively affected. This paper provides benchmark results for a representative workload from the Department of Defense High‐performance Computing Modernization Program. The tests were run on Cray XT‐3 and XT‐4 systems, which use dual‐ and quad‐core AMD Opteron microprocessors. We use a combination of synthetic kernel and application benchmarks to examine cache performance, MPI task placement strategies and compiler optimizations. Our benchmarks show performance behavior similar to that reported in other studies and at other sites. Dual‐ and quad‐core tests show a run‐time performance penalty compared with single‐core runs on the same systems. We attribute this performance degradation to a combination of L1 to main memory contention and task placement within the application. Copyright © 2009 John Wiley & Sons, Ltd.
