Formally optimal boxing

An important implementation decision in polymorphically typed functional programming languages is whether to represent data in boxed or unboxed form and when to transform them from one representation to the other. Using a language with explicit representation types and boxing/unboxing operations, we axiomatize equationally the set of all explicitly boxed versions, called completions, of a given source program. In a two-stage process we give some of the equations a rewriting interpretation that captures the elimination of boxing/unboxing operations without relying on a specific implementation or even a specific semantics of the underlying language. The resulting reduction systems operate on congruence classes of completions defined by the remaining equations E, which can be understood as moving boxing/unboxing operations along data flow paths in the source program. We call a completion e_opt formally optimal if every other completion for the same program (and at the same representation type) reduces to e_opt under this two-stage reduction. We show that every source program has formally optimal completions, which are unique modulo E. This is accomplished by first “polarizing” the equations in E and orienting them to obtain two canonical (confluent and strongly normalizing) rewriting systems. The completions produced by Leroy's and Poulsen's algorithms are generally not formally optimal in our sense. The rewriting systems have been implemented and applied to some simple Standard ML programs. Our results show that the number of boxing and unboxing operations is substantially reduced in practice as well, in comparison to Leroy's completions. This analysis is intended to be integrated into Tofte's region-based implementation of Standard ML currently underway at DIKU.
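To give a rough feel for what eliminating boxing/unboxing operations can mean, the following minimal sketch cancels adjacent inverse operations on a single data flow path. It is only an illustration under strong simplifying assumptions: a path is modeled as a flat list of the strings "box" and "unbox", the name `cancel_adjacent` is hypothetical, and both orders of an adjacent pair are treated as cancellable. It does not reproduce the two-stage reduction modulo E described above, which works on whole completions rather than on isolated paths.

```python
from typing import List

BOX, UNBOX = "box", "unbox"


def cancel_adjacent(path: List[str]) -> List[str]:
    """Remove adjacent box/unbox pairs (in either order) from a coercion path.

    Assumption for this sketch: a path is a flat sequence of representation
    changes applied one after the other to the same value.
    """
    remaining: List[str] = []
    for op in path:
        if op not in (BOX, UNBOX):
            raise ValueError(f"unknown operation: {op!r}")
        if remaining and remaining[-1] != op:
            # The previous operation is the inverse of this one: drop both.
            remaining.pop()
        else:
            remaining.append(op)
    return remaining


if __name__ == "__main__":
    naive = [BOX, UNBOX, BOX]
    reduced = cancel_adjacent(naive)
    print(len(naive), "operations before,", len(reduced), "after")
```

The stack-based loop also catches nested pairs such as box, box, unbox, unbox, which a single left-to-right scan over fixed adjacent positions would miss.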

[1] Fritz Henglein, et al. Dynamic Typing: Syntax and Proof Theory, 1994, Sci. Comput. Program.

[2] Xavier Leroy, et al. Unboxed objects and polymorphic typing, 1992, POPL '92.

[3] Guy L. Steele, et al. Fast arithmetic in MacLISP, 1977.

[4] Robert Cartwright, et al. Soft typing, 1991, PLDI '91.

[5] Giorgio Ghelli, et al. Coherence of Subsumption, 1990, CAAP.

[6] Val Tannen, et al. Computing with coercions, 1990, LISP and Functional Programming.

[7] Mads Tofte, et al. Implementation of the typed call-by-value λ-calculus using a stack of regions, 1994, POPL '94.

[8] John Launchbury, et al. Unboxed values as first class citizens, 1991.

[9] Robin Milner, et al. Principal type-schemes for functional programs, 1982, POPL '82.

[10] John Peterson. Untagged data in tagged environments: choosing optimal representations at compile time, 1989, FPCA.

[11] Philip Wadler, et al. Theorems for free!, 1989, FPCA.

[12] John C. Reynolds, et al. Types, Abstraction and Parametric Polymorphism, 1983, IFIP Congress.

[13] Rodney A. Brooks, et al. An optimizing compiler for lexically scoped LISP, 1982, SIGPLAN '82.

[14] Thierry Coquand, et al. Inheritance as Implicit Coercion, 1991, Inf. Comput.

[15] Paul Hudak, et al. ORBIT: an optimizing compiler for scheme, 1986, SIGPLAN '86.

[16] Simon L. Peyton Jones, et al. Unboxed Values as First Class Citizens in a Non-Strict Functional Language, 1991, FPCA.

[17] Nikolaj Skallerud Bjørner, et al. Minimal Typing Derivations, 1994.

[18] John C. Mitchell, et al. On the type structure of standard ML, 1993, TOPL.

[19] Robin Milner, et al. A Theory of Type Polymorphism in Programming, 1978, J. Comput. Syst. Sci.