Optimal code generation for expression trees: an application of BURS theory

A <italic>Rewrite System</italic> is a collection of <italic>rewrite rules</italic> of the form α → β, where α and β are tree patterns. A rewrite system can be extended by associating a cost with each rewrite rule and by defining the cost of a rewrite sequence as the sum of the costs of all the rewrite rules in the sequence. The REACHABILITY problem for a rewrite system <italic>R</italic> is, given an input tree <italic>T</italic> and a fixed <italic>goal</italic> tree <italic>G</italic>, to determine whether there exists a rewrite sequence in <italic>R</italic> that rewrites <italic>T</italic> into <italic>G</italic> and, if so, to obtain one such sequence. The C-REACHABILITY problem is similar, except that the obtained sequence must have minimal cost among all sequences rewriting <italic>T</italic> into <italic>G</italic>. This paper introduces a class of rewrite systems called Bottom-Up Rewrite Systems (BURS) and a table-driven algorithm to solve REACHABILITY for members of the class. This algorithm is then modified to solve C-REACHABILITY and specialized for a subclass of BURS so that all cost manipulation is encoded into the tables and is not performed explicitly at solving time. The subclass extends the <italic>simple machine grammars</italic> [AGH84], rewrite systems used to describe target machine architectures for code generation, by allowing additional types of rewrite rules such as commutativity transformations. A table-driven code generator based on solving C-REACHABILITY has been implemented and tested with several machine descriptions. The code generator solves C-REACHABILITY faster than a comparable solver based on Graham-Glanville techniques [AGH84] (a non-optimal technique), yet requires only slightly larger tables. The code generator also runs much faster than recent proposals to solve C-REACHABILITY that use pattern matching and deal with costs explicitly at solving time [AGT86, HeD87, WeW86].
The BURS theory generalizes and unifies the bottom-up approaches of Henry/Damron [HeD87] and Weisgerber/Wilhelm [WeW86].
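
To make the definitions of a costed rewrite system and of C-REACHABILITY concrete, the following Python sketch solves a toy instance by brute force. It is not the table-driven BURS algorithm described above: it is a plain uniform-cost (Dijkstra) search over trees that handles costs explicitly at solving time. The tree encoding, the four rules and their costs, and all function names are illustrative assumptions, not taken from the paper.

```python
import heapq
from itertools import count

# Assumed encoding: a tree is a tuple (operator, child, ...); a bare string
# inside a pattern is a pattern variable that matches any subtree.

def match(pattern, tree, env):
    """Return variable bindings if pattern matches tree, else None."""
    if isinstance(pattern, str):                      # pattern variable
        if pattern in env:
            return env if env[pattern] == tree else None
        return {**env, pattern: tree}
    if pattern[0] != tree[0] or len(pattern) != len(tree):
        return None
    for p, t in zip(pattern[1:], tree[1:]):
        env = match(p, t, env)
        if env is None:
            return None
    return env

def substitute(pattern, env):
    """Instantiate a pattern using the bindings in env."""
    if isinstance(pattern, str):
        return env[pattern]
    return (pattern[0],) + tuple(substitute(p, env) for p in pattern[1:])

def successors(tree, rules):
    """Yield (cost, rule_name, new_tree) for each single rewrite step,
    applied at the root or at any subtree."""
    for name, lhs, rhs, cost in rules:
        env = match(lhs, tree, {})
        if env is not None:
            yield cost, name, substitute(rhs, env)
    for i in range(1, len(tree)):
        for cost, name, child in successors(tree[i], rules):
            yield cost, name, tree[:i] + (child,) + tree[i + 1:]

def c_reachability(tree, goal, rules):
    """Return (min_cost, rule_names) rewriting tree into goal, or None.
    Uniform-cost search; assumes non-negative costs and a finite set of
    reachable trees (true here because no rule enlarges a tree)."""
    tie = count()                                     # avoids comparing trees
    heap = [(0, next(tie), tree, [])]
    done = set()
    while heap:
        cost, _, t, seq = heapq.heappop(heap)
        if t == goal:
            return cost, seq
        if t in done:
            continue
        done.add(t)
        for c, name, nxt in successors(t, rules):
            if nxt not in done:
                heapq.heappush(heap, (cost + c, next(tie), nxt, seq + [name]))
    return None

# Hypothetical machine description: (name, alpha, beta, cost).
RULES = [
    ("load",    ("Const",),                   ("Reg",),        1),
    ("add",     ("+", ("Reg",), ("Reg",)),    ("Reg",),        1),
    ("addi",    ("+", ("Reg",), ("Const",)),  ("Reg",),        1),
    ("commute", ("+", "x", "y"),              ("+", "y", "x"), 0),
]
TREE = ("+", ("Const",), ("Reg",))                    # input tree T
GOAL = ("Reg",)                                       # goal tree G

print(c_reachability(TREE, GOAL, RULES))              # (1, ['commute', 'addi'])
```

In this toy instance the commutativity rule lets the search use the add-immediate rule for a total cost of 1; without it, the cheapest sequence loads the constant and then adds, for a cost of 2. The search recomputes costs for every input tree, which is the run-time work that the abstract says the specialized BURS tables avoid.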