Principled Optimization of Functional Programs

Automatic optimizers for computer programs work with a fixed list of rote transformations, while human programmers can go on to derive new optimizations from broad and intuitive principles if known transformations prove inadequate. This dissertation investigates the possibility that principled optimization can be automated, focusing on a single principle (the principle that programs should not do anything unnecessary) and a single program domain (the domain of purely functional, first-order programs). Three questions are explored: How can the principle be formalized? How can violations of the principle be detected? How can violations be repaired? The trace grammar is a new representation for first-order functional programs. It permits a simple formal statement of the optimizing principle. A trace grammar is a kind of graph grammar. An individual graph represents a path of execution through the program, without any loops or conditionals. The grammar for a program generates a language of graphs representing the possible paths of execution through that program. Trace grammars provide unique leverage for the twin problems of identifying and repairing violations of the principle. Detecting violations of the principle is a problem in semantic analysis. A new method of inference, the relational constraint method, helps make this tractable. The method treats a trace graph as a system of constraints on a lattice of binary relations, and uses those constraints to develop the strongest relation it can find for each pair of values in the graph. Repairing violations is not easy: given an example of an unnecessary computation performed by the program, one wants to modify the program so that it never makes that mistake. This grammar thinning problem for trace grammars corresponds to an interesting open problem on context-free grammars. An (approximate) solution to this CFG problem yields an optimization technique for trace grammars. An optimizer called Thinner is the proof-of-concept for these ideas. Using the techniques outlined above, Thinner rediscovers a variety of common compiler optimizations. It also finds other, more exotic transformations, including the well-known Fibonacci reformulation. Thinner demonstrates the potential of the principled approach as a high-powered optimizing tool.