Loop staggering, loop compacting: Restructuring techniques for thrashing problem

Parallel loops account for the greatest amount of parallelism in numerical programs. Executing nested loops in parallel with low run-time overhead is thus very important for achieving high performance in parallel processing systems. However, in parallel processing systems whose memory hierarchies include caches or local memories, a “thrashing problem” may arise when data move back and forth frequently between the caches or local memories of different processors. The parallel-compiler techniques for solving this problem are not yet fully developed. In this paper, we present two restructuring techniques, called loop staggering and loop staggering and compacting, with which we can not only significantly reduce cache or local memory thrashing, but also exploit the potential parallelism in the outer serial loop. Loop staggering benefits dynamic loop scheduling strategies, whereas loop staggering and compacting suits static loop scheduling strategies. Our method especially benefits parallel programs in which a parallel loop is enclosed by a serial loop and array elements are reused across different iterations of the parallel loop.
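To make the thrashing pattern concrete, the following is a minimal sketch that simulates a parallel loop enclosed by a serial loop and counts how often an array element has to move between processors. It illustrates only the problem, not loop staggering or compacting themselves. The function names, the one-element-per-iteration access pattern, and the two assignment policies (`rotating` and `fixed`) are assumptions chosen for illustration and do not come from the paper.

```python
def count_transfers(n_iters, n_steps, n_procs, assign):
    """Count how often an array element changes its owning processor.

    Hypothetical model: inner iteration i touches only element A[i], and
    the element stays in the cache (or local memory) of the processor
    that touched it last.
    """
    owner = [None] * n_iters              # processor currently holding A[i]
    transfers = 0
    for t in range(n_steps):              # outer serial loop
        for i in range(n_iters):          # inner parallel loop (simulated sequentially)
            p = assign(i, t, n_procs)     # processor that runs iteration i at step t
            if owner[i] is not None and owner[i] != p:
                transfers += 1            # A[i] must move between processors
            owner[i] = p
    return transfers


def rotating(i, t, n_procs):
    """Assumed policy: iteration i lands on a different processor each outer step."""
    return (i + t) % n_procs


def fixed(i, t, n_procs):
    """Assumed policy: iteration i always runs on the same processor."""
    return i % n_procs


if __name__ == "__main__":
    print("rotating:", count_transfers(8, 4, 4, rotating))
    print("fixed:   ", count_transfers(8, 4, 4, fixed))
```

Under these assumptions, the rotating assignment moves every element on every outer step after the first, while the fixed assignment moves none. Restructuring the loops so that reused elements stay with the same processor is the kind of effect the techniques in this paper aim for.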