Cache performance of combinator graph reduction

The Threaded Interpretive Graph Reduction Engine (TIGRE) was developed for the efficient reduction of combinator graphs in support of functional programming languages and other applications. This paper presents cache simulations of the TIGRE graph reducer in which the following parameters are varied: cache size, cache organization, block size, associativity, replacement policy, write policy, and write allocation. As a check, the simulation results are compared with measured performance on real hardware. The simulation study leads to two conclusions: graph reduction in TIGRE depends very heavily on a write-allocate strategy for good performance, and it exhibits very high spatial and temporal locality.
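To make the write-allocation parameter concrete, the following is a minimal sketch of a direct-mapped cache model, not the simulator used in the study. The function name `simulate`, the cache geometry, and the trace are all hypothetical; the trace simply assumes a pattern in which freshly written memory cells are read back shortly afterwards, which is one kind of access pattern where a write-allocate policy would be expected to matter.

```python
def simulate(trace, num_blocks=4, block_size=4, write_allocate=True):
    """Count misses for a direct-mapped cache over (op, address) pairs.

    op is "r" (read) or "w" (write). All parameters are illustrative
    assumptions, not values taken from the paper.
    """
    tags = [None] * num_blocks  # one stored block number per line; None = invalid
    misses = 0
    for op, addr in trace:
        block = addr // block_size   # which memory block the address falls in
        index = block % num_blocks   # which cache line that block maps to
        if tags[index] == block:
            continue                 # hit: nothing to count
        misses += 1
        # A read miss always fills the line; a write miss fills it
        # only under a write-allocate policy.
        if op == "r" or write_allocate:
            tags[index] = block
    return misses


# Hypothetical trace: write 16 consecutive cells, then read them back.
trace = [("w", a) for a in range(16)] + [("r", a) for a in range(16)]

with_allocate = simulate(trace, write_allocate=True)
without_allocate = simulate(trace, write_allocate=False)
print(with_allocate, without_allocate)
```

In this toy setting, write allocation lets the first write to each block bring the block into the cache, so the later writes and reads to that block hit. Without it, every write misses and the blocks are only loaded when they are first read.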