Look ma, no hashing, and no arrays neither

It is generally assumed that hashing is essential to many algorithms related to efficient compilation; e.g., symbol table formation and maintenance, grammar manipulation, basic block optimization, and global optimization. This paper questions this assumption, and initiates development of an efficient alternative compiler methodology without hashing or sorting. Underlying this methodology are several generic algorithmic tools, among which special importance is given to multiset discrimination, which partitions a multiset into blocks of duplicate elements. We show how multiset discrimination, together with other tools, can be tailored to rid compilation of hashing without loss in asymptotic performance. Because of the simplicity of these tools, our results may be of practical as well as theoretical interest. The various applications presented culminate with a new algorithm to solve iterated strength reduction folded with useless code elimination that runs in worst-case asymptotic time and auxiliary space linear in the maximum text length of the initial and optimized programs.
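
To make the idea of multiset discrimination concrete, here is a minimal sketch for strings, written in Python. It is an illustration under stated assumptions, not the paper's algorithm: the function name `discriminate_strings` and the `alphabet_size` parameter are hypothetical, characters are assumed to have codes below `alphabet_size`, and the bucket directory is an ordinary Python list (an array), which the paper's title suggests the authors themselves avoid. The sketch only shows that duplicates can be grouped without hashing or sorting, in time proportional to the total length of the input.

```python
def discriminate_strings(strings, alphabet_size=256):
    """Partition the indices of `strings` into blocks of equal strings.

    No hashing and no sorting: strings are refined one character
    position at a time through a reusable bucket directory.
    Assumes ord(c) < alphabet_size for every character c.
    """
    # One bucket per character code; buckets are emptied after each use.
    buckets = [[] for _ in range(alphabet_size)]
    finished = []                                # completed blocks of duplicates
    pending = [(list(range(len(strings))), 0)]   # (block of indices, position)

    while pending:
        block, pos = pending.pop()
        ended = []   # strings in this block that end exactly at `pos`
        used = []    # character codes touched, so we never scan all buckets
        for i in block:
            s = strings[i]
            if len(s) == pos:
                ended.append(i)
            else:
                code = ord(s[pos])
                if not buckets[code]:
                    used.append(code)
                buckets[code].append(i)
        if ended:
            # All of these agree on every position and have equal length.
            finished.append(ended)
        for code in used:
            pending.append((buckets[code], pos + 1))
            buckets[code] = []   # fresh list; the old one now lives in `pending`
    return finished


if __name__ == "__main__":
    names = ["a", "b", "a", "ab", "b", ""]
    for block in discriminate_strings(names):
        print([names[i] for i in block])
```

Each character of each string is examined once, and only the buckets actually touched are visited, so the work is linear in the total input length regardless of the alphabet size. The order in which blocks are reported is not specified.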
