Performance Analysis of Prefetching Thread for Linked Data Structure in CMPs

Chip multiprocessors (CMPs) present new opportunities for data prefetching. A prefetching thread is a well-known approach to reducing memory latency and improving performance, and it has been explored in a range of applications. However, for applications with linked data structures (LDS), a prefetching thread tends to achieve little overall performance gain. In this paper, we analyze the performance of a conventional prefetching thread using an example and five benchmarks selected from the Olden benchmark suite. The experimental results show that the prefetching thread performs best when the computation/access latency ratio is close to 1. In addition, we propose a theorem, give its proof, and verify it against our experimental results.

Keywords: CMP; Prefetching Thread; Computation/Access Latency Ratio; Performance Analysis
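The role of the computation/access latency ratio can be illustrated with a toy steady-state model. This is only a sketch under stated assumptions and is not the theorem or the experimental setup of the paper: each list node costs the main thread `compute` cycles of work and `access` cycles of miss latency, pointer chasing serializes the misses, and an ideal prefetching thread on another core overlaps them fully with the main thread's computation. The names `speedup` and `best_ratio` are hypothetical.

```python
def speedup(compute: float, access: float) -> float:
    """Idealized speedup from a prefetching thread on a linked list.

    Assumed model (not taken from the paper):
      - without prefetching, each node costs compute + access cycles;
      - with an ideal prefetching thread, the two overlap, so the
        steady-state cost per node is max(compute, access).
    """
    baseline = compute + access
    overlapped = max(compute, access)
    return baseline / overlapped


def best_ratio(ratios, access: float = 100.0) -> float:
    """Return the computation/access ratio with the highest modeled speedup."""
    return max(ratios, key=lambda r: speedup(r * access, access))


if __name__ == "__main__":
    # Sweep the ratio while holding the (assumed) access latency fixed.
    for r in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(f"ratio={r:<4} speedup={speedup(r * 100.0, 100.0):.2f}")
```

In this model the speedup is bounded by 2 and reaches that bound only when the ratio equals 1. When computation is much shorter than the access latency, the serialized misses dominate; when it is much longer, there is little latency left to hide.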
