When Do Match-compilation Heuristics Matter?

Modern statically typed functional languages define functions by pattern matching. Although the semantics of pattern matching is defined as checking a value sequentially against one pattern after another, real implementations translate patterns into automata that can test a value against many patterns at once. Decision trees are a popular choice of automaton. The cost of using a decision tree is related to its size and shape. The only method guaranteed to produce decision trees of minimum cost requires exponential match-compilation time, so a number of heuristics have been proposed in the literature or used in actual compilers. This paper presents an experimental evaluation of such heuristics, using the Standard ML of New Jersey compiler. The principal finding is that for most benchmark programs, all heuristics produce trees of identical size. For a few programs, choosing one heuristic over another may change the size of a decision tree, but seldom by more than a few percent. There are, however, machine-generated programs for which the right or wrong heuristic can make an enormous difference in tree size: a factor of 2 to 20.
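To make the role of a heuristic concrete, the following is a minimal sketch of match compilation to a decision tree, written in Python rather than Standard ML and not taken from the SML/NJ compiler. It assumes a toy setting: every scrutinee is a tuple of booleans written "T" or "F", patterns have no nested constructors, and the names `compile_match`, `leftmost_any`, and `first_row` are illustrative only. The two column-choice functions are simple stand-ins and may not correspond exactly to the heuristics evaluated in the paper.

```python
from itertools import product

WILD = "_"                  # wildcard pattern
CONSTRUCTORS = ("T", "F")   # assumption: every column is a boolean

# Hypothetical example match: (tuple of column patterns, action number).
ROWS = [
    ((WILD, "F", "T"), 1),
    (("F", "T", WILD), 2),
    ((WILD, WILD, "F"), 3),
    ((WILD, WILD, "T"), 4),
]

def sequential(rows, value):
    """Reference semantics: try each pattern row in order."""
    for pats, action in rows:
        if all(p in (WILD, v) for p, v in zip(pats, value)):
            return action
    return None

def specialize(rows, col, ctor):
    """Keep rows compatible with `ctor` in column `col`; mark that column as tested."""
    kept = []
    for pats, action in rows:
        if pats[col] in (ctor, WILD):
            kept.append((pats[:col] + (WILD,) + pats[col + 1:], action))
    return kept

def compile_match(rows, choose):
    """Build a decision tree; `choose` is the column-selection heuristic."""
    if not rows:
        return ("fail",)
    first_pats, first_action = rows[0]
    if all(p == WILD for p in first_pats):
        return ("leaf", first_action)
    col = choose(rows)
    branches = {c: compile_match(specialize(rows, col, c), choose)
                for c in CONSTRUCTORS}
    return ("switch", col, branches)

def tree_size(tree):
    """Count every node: switches, leaves, and failure nodes."""
    if tree[0] == "switch":
        return 1 + sum(tree_size(t) for t in tree[2].values())
    return 1

def run(tree, value):
    """Evaluate a decision tree on a concrete tuple of booleans."""
    while tree[0] == "switch":
        tree = tree[2][value[tree[1]]]
    return tree[1] if tree[0] == "leaf" else None

def leftmost_any(rows):
    """Leftmost column in which any row still has a constructor."""
    width = len(rows[0][0])
    return next(c for c in range(width)
                if any(pats[c] != WILD for pats, _ in rows))

def first_row(rows):
    """Leftmost column in which the first row has a constructor."""
    return next(c for c, p in enumerate(rows[0][0]) if p != WILD)

naive_tree = compile_match(ROWS, leftmost_any)
first_row_tree = compile_match(ROWS, first_row)
print(tree_size(naive_tree), tree_size(first_row_tree))
```

On this four-rule match the sketch builds a tree of 13 nodes when it always tests the leftmost untested column and a tree of 9 nodes when it tests a column the first rule actually needs. Both trees return the same action as sequential matching on all eight inputs. Only the size differs, which is the quantity the paper measures.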
