Performance analysis of Intel Core 2 Duo processor

With the emergence of thread-level parallelism as a more efficient method of improving processor performance, Chip Multiprocessor (CMP) technology is being more widely used in developing processor architectures. In addition, the widening gap between CPU and memory speed has prompted researchers to study the performance of hierarchical memory architectures. As part of this research, performance characterization studies were carried out on the Intel Core 2 Duo, a dual-core, power-efficient processor, using a variety of new-generation benchmarks. This study provides a detailed analysis of memory hierarchy performance and of performance scalability between single-core and dual-core processors. The behavior of the SPEC CPU2006 benchmarks running on the Intel Core 2 Duo processor is also explained. Lastly, overall execution time and throughput measurements using both multi-programmed and multi-threaded workloads on the Intel Core 2 Duo processor are reported and compared to those of the Intel Pentium D and AMD Athlon 64 X2 processors. Results showed that the Intel Core 2 Duo had the best performance for a variety of workloads due to its advanced micro-architectural features, such as the shared L2 cache, fast cache-to-cache communication, and smart memory access.
