CACHE MISS ANALYSIS OF 2D STENCIL CODES WITH TILED TIME LOOP
Stencil codes such as the Jacobi, Gauss-Seidel, and red-black Gauss-Seidel kernels are among the most time-consuming routines in many scientific and engineering applications. The performance of these codes depends critically on efficient use of the caches and can be improved by tiling. Several tiling schemes have been suggested in the literature; this paper gives an overview and comparison. In the main part, we prove a lower bound on the number of cold and capacity misses. Finally, we analyze a particular tiling scheme and show that it is off the lower bound by a factor of at most ten. Our results reveal limits on the speedup that future research can achieve.
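As a rough illustration of what tiling the time loop means for a Jacobi kernel, the following Python sketch compares a plain sweep-by-sweep time loop with a simple overlapped tiling, in which each spatial tile plus a halo is advanced several time steps before moving on. This is a sketch under stated assumptions: the overlapped (redundant-computation) scheme was picked because it is easy to verify and is not necessarily the scheme analyzed in the paper, and the function names and the parameters `tile` and `tb` (time steps per block) are hypothetical.

```python
def jacobi_step(u):
    """One Jacobi sweep of the 5-point stencil; the outermost cells stay fixed."""
    n, m = len(u), len(u[0])
    v = [row[:] for row in u]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            v[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
    return v


def jacobi_naive(u, steps):
    """Untiled time loop: every sweep streams the whole grid through the cache."""
    for _ in range(steps):
        u = jacobi_step(u)
    return u


def jacobi_overlapped(u, steps, tile, tb):
    """Time-tiled variant (assumed scheme): advance each tile by up to tb steps.

    Each tile is copied together with a halo of width t (clipped to the grid),
    advanced t steps locally, and only the tile interior is written back.
    Values in the halo become stale one cell per step, so after t steps the
    stale region has not yet reached the tile itself.
    """
    n, m = len(u), len(u[0])
    done = 0
    while done < steps:
        t = min(tb, steps - done)
        out = [row[:] for row in u]
        for i0 in range(0, n, tile):
            for j0 in range(0, m, tile):
                i1, j1 = min(i0 + tile, n), min(j0 + tile, m)
                a0, a1 = max(i0 - t, 0), min(i1 + t, n)  # halo rows
                b0, b1 = max(j0 - t, 0), min(j1 + t, m)  # halo columns
                local = [row[b0:b1] for row in u[a0:a1]]
                for _ in range(t):
                    local = jacobi_step(local)
                for i in range(i0, i1):
                    out[i][j0:j1] = local[i - a0][j0 - b0:j1 - b0]
        u = out
        done += t
    return u
```

Both functions compute the same grid; the tiled version trades redundant halo work for reuse of a small working set across `tb` time steps, which is the kind of trade-off a cache miss analysis has to quantify.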