Exploiting Temporal Locality Using a Dependence Driven Execution

The order in which loop iterations are executed can have a large impact on the number of cache misses that an application incurs. A new loop order that preserves the semantics of the original order but achieves better cache data reuse improves the performance of that application. Several compiler techniques exist to statically transform loops so that the order of iterations reduces cache misses. This paper introduces a run-time method that determines the order based on a dependence-driven execution. In a dependence-driven execution, the program traverses the iteration space by following the dependence arcs between iterations.
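
The idea of traversing an iteration space along its dependence arcs can be illustrated with a minimal Python sketch. It is not the method of this paper, only one possible reading of the description above. The function name `dependence_driven_order`, the two-dimensional iteration space, the example distance vectors (1, 0) and (0, 1), and the use of a LIFO stack of ready iterations are all assumptions made for illustration.

```python
def dependence_driven_order(n, m, deps=((1, 0), (0, 1))):
    """Return one dependence-respecting execution order for an n x m space.

    deps holds (assumed) dependence distance vectors: iteration (i, j)
    must run before iteration (i + di, j + dj) for each (di, dj) in deps.
    """
    def in_space(i, j):
        return 0 <= i < n and 0 <= j < m

    # Count the unsatisfied incoming dependence arcs of every iteration.
    pending = {
        (i, j): sum(in_space(i - di, j - dj) for di, dj in deps)
        for i in range(n)
        for j in range(m)
    }

    # Iterations without predecessors are ready to run immediately.
    ready = [it for it, count in pending.items() if count == 0]

    order = []
    while ready:
        # LIFO choice (an assumption): prefer the most recently enabled
        # iteration, which likely touches data that was just produced.
        i, j = ready.pop()
        order.append((i, j))  # stands in for executing the loop body

        # Follow the outgoing dependence arcs to the successor iterations.
        for di, dj in deps:
            succ = (i + di, j + dj)
            if in_space(*succ):
                pending[succ] -= 1
                if pending[succ] == 0:
                    ready.append(succ)
    return order
```

Calling `dependence_driven_order(4, 5)` yields a list of all 20 iterations in which every iteration appears after the iterations it depends on. Whether such an order actually reduces cache misses depends on the policy used to pick among ready iterations, which this sketch does not attempt to model.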