This work presents a performance study of six types of commodity memory in two commodity server nodes. The study uses a number of micro-benchmarks that measure low-level performance characteristics, together with two applications representative of the ASC workload. The memories differ in performance, including latency and bandwidth, as well as in physical properties and manufacturer. The two server nodes analyzed were an Itanium-II Madison-based system and a Xeon-based system. All six memories can be used in both nodes, which allows their performance to be compared directly while all other factors within a node (processor, motherboard, operating system, etc.) are held constant. The results show that application performance can differ significantly, by as much as 20%, depending on the memory used. The achieved performance depends both on how the memory is integrated into the node and on how the applications actually use it.