Performance comparison of data prefetching for pointer-chasing applications

Data prefetching is a well-known approach to reducing memory latency and improving performance, and it has been explored in many applications. Chip multiprocessors (CMPs) now present new opportunities for data prefetching. However, for pointer-chasing applications with irregular memory access patterns, prefetching tends to achieve little overall performance gain. In this paper, we compare and analyze the performance of a conventional prefetching thread and of prefetch instructions, using an example and six benchmarks selected from the Olden benchmark suite. The experimental results show that prefetch instructions achieve better performance in most cases. In addition, we observe that the prefetching thread generally eliminates more L2 read misses than prefetch instructions do.
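To make the prefetch-instruction approach concrete, the following is a minimal sketch of a pointer-chasing traversal with and without a software prefetch. It is not the code evaluated in the paper. It assumes a GCC- or Clang-compatible compiler that provides `__builtin_prefetch`, and the `struct node` layout and the function names `sum_list`, `sum_list_prefetch`, and `build_list` are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node, used only for this illustration. */
struct node {
    long value;
    struct node *next;
};

/* Baseline pointer-chasing traversal: the address of each node is
 * known only after the previous node has been loaded. */
long sum_list(const struct node *head) {
    long sum = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        sum += p->value;
    return sum;
}

/* Same traversal with a prefetch instruction issued for the next node
 * before the current node is processed. The arguments 0 and 3 request
 * a read prefetch with high temporal locality. The prefetch is only a
 * hint and does not fault, so a NULL next pointer is harmless here. */
long sum_list_prefetch(const struct node *head) {
    long sum = 0;
    for (const struct node *p = head; p != NULL; p = p->next) {
        __builtin_prefetch(p->next, 0, 3);
        sum += p->value;
    }
    return sum;
}

/* Helper (an assumption of this sketch): builds a list holding the
 * values 1..n in one allocation, so free(head) releases every node.
 * Returns NULL if the allocation fails. */
struct node *build_list(size_t n) {
    assert(n > 0);
    struct node *nodes = malloc(n * sizeof *nodes);
    if (nodes == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        nodes[i].value = (long)(i + 1);
        nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : NULL;
    }
    return nodes;
}
```

Both functions return the same result; only the timing of memory accesses can differ. Because the prefetch is issued just one node ahead, there is little work to overlap with the memory latency, which is one reason prefetching may yield small gains on irregular, pointer-linked structures. A prefetching thread would instead run a stripped-down copy of the traversal on another core to warm a shared cache; that variant is not shown here.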
