The performance evaluation was carried out on the Multi2Sim 2.2 simulation framework [2], a cycle-accurate simulator for x86-based superscalar processors, extended to model a clustered architecture with support for generating independent subtraces. The parameters of the modeled machine are summarized in Table 1. The Mediabench suite is used as the workload, and each simulation stops once the first 100 million uops have committed. The steering algorithm and the interconnection network among clusters are important design factors, because both affect how critical the inter-cluster communication latency becomes. To obtain a strong baseline, the modeled schemes use a sophisticated steering algorithm called topology-aware steering [3], and several interconnection networks with different realistic link delays are considered.
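To make the interaction between steering and the interconnect concrete, the following is a minimal sketch of a latency-conscious steering decision. It is not the topology-aware steering algorithm of [3] and not part of Multi2Sim: the bidirectional ring topology, the function names (`ring_hops`, `comm_latency`, `steer`), the per-cluster `capacity` limit, and the occupancy tie-break are all assumptions chosen for illustration.

```python
def ring_hops(src, dst, num_clusters):
    """Hop count between two clusters on an (assumed) bidirectional ring."""
    d = abs(src - dst) % num_clusters
    return min(d, num_clusters - d)


def comm_latency(src, dst, num_clusters, link_delay):
    """Cycles needed to forward a value from src to dst (0 within a cluster)."""
    return ring_hops(src, dst, num_clusters) * link_delay


def steer(producer_clusters, occupancy, num_clusters, link_delay, capacity):
    """Hypothetical steering heuristic.

    Among clusters with a free slot, pick the one that minimizes the
    worst-case latency to receive any source operand. Ties are broken by
    lower occupancy, then by lower cluster index. Returns None if every
    cluster is full (the uop would stall).
    """
    best_key, best_cluster = None, None
    for c in range(num_clusters):
        if occupancy[c] >= capacity:
            continue  # no free slot in this cluster
        worst = max(
            (comm_latency(p, c, num_clusters, link_delay)
             for p in producer_clusters),
            default=0,  # no in-flight producers: no communication cost
        )
        key = (worst, occupancy[c], c)
        if best_key is None or key < best_key:
            best_key, best_cluster = key, c
    return best_cluster


# Example: 4 clusters, 1-cycle links, producer in cluster 0, cluster 0 full.
choice = steer([0], [4, 1, 0, 0], num_clusters=4, link_delay=1, capacity=4)
```

In this sketch a larger `link_delay` makes every remote placement proportionally more expensive, which is why the evaluation considers several link delays rather than a single one.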
[1] Ramon Canal et al. A cost-effective clustered architecture. International Conference on Parallel Architectures and Compilation Techniques, 1999.
[2] Pedro López et al. Multi2Sim: A Simulation Framework to Evaluate Multicore-Multithreaded Processors. 19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07), 2007.
[3] José Duato et al. Efficient interconnects for clustered microarchitectures. International Conference on Parallel Architectures and Compilation Techniques, 2002.