Purely functional data structures

When a C programmer needs an efficient data structure for a particular problem, he or she can often simply look one up in any of a number of good textbooks or handbooks. Unfortunately, programmers in functional languages such as Standard ML or Haskell do not have this luxury. Although some data structures designed for imperative languages such as C can be quite easily adapted to a functional setting, most cannot, usually because they depend in crucial ways on assignments, which are disallowed, or at least discouraged, in functional languages. To address this imbalance, we describe several techniques for designing functional data structures, and numerous original data structures based on these techniques, including multiple variations of lists, queues, double-ended queues, and heaps, many supporting more exotic features such as random access or efficient catenation.

In addition, we expose the fundamental role of lazy evaluation in amortized functional data structures. Traditional methods of amortization break down when old versions of a data structure, not just the most recent, are available for further processing. This property is known as persistence, and is taken for granted in functional languages. On the surface, persistence and amortization appear to be incompatible, but we show how lazy evaluation can be used to resolve this conflict, yielding amortized data structures that are efficient even when used persistently. Turning this relationship between lazy evaluation and amortization around, the notion of amortization also provides the first practical techniques for analyzing the time requirements of non-trivial lazy programs.

Finally, our data structures offer numerous hints to programming language designers, illustrating the utility of combining strict and lazy evaluation in a single language, and providing non-trivial examples using polymorphic recursion and higher-order, recursive modules.
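To make the interplay of persistence, amortization, and lazy evaluation more concrete, the following is a minimal Haskell sketch of one well-known queue design in this spirit. It is an illustration only and is not taken from the text above: the names `Queue`, `check`, `snoc`, `uncons`, and `toList` are chosen here for the example. The queue keeps a front list and a reversed rear list. When the rear grows longer than the front, the expensive `reverse` is appended to the front as a suspended computation. Because Haskell memoizes a suspension once it is forced, every older version of the queue that shares it pays for the reversal at most once.

```haskell
module Main where

-- A queue: length and contents of the front list, then length and
-- contents of the rear list (stored in reverse order).
data Queue a = Queue Int [a] Int [a]

empty :: Queue a
empty = Queue 0 [] 0 []

-- Restore the invariant that the rear is never longer than the front.
-- The expression (f ++ reverse r) is not evaluated here; it is a lazy
-- suspension that is forced, and then memoized, only when needed.
check :: Queue a -> Queue a
check q@(Queue lf f lr r)
  | lr <= lf  = q
  | otherwise = Queue (lf + lr) (f ++ reverse r) 0 []

-- Add an element at the rear. The old queue remains valid (persistence).
snoc :: Queue a -> a -> Queue a
snoc (Queue lf f lr r) x = check (Queue lf f (lr + 1) (x : r))

-- Remove the first element, if any. An empty front implies an empty queue,
-- because the invariant forces the rear to be empty as well.
uncons :: Queue a -> Maybe (a, Queue a)
uncons (Queue _ [] _ _) = Nothing
uncons (Queue lf (x : f') lr r) = Just (x, check (Queue (lf - 1) f' lr r))

-- Drain a queue into a list, front to back.
toList :: Queue a -> [a]
toList q = case uncons q of
  Nothing      -> []
  Just (x, q') -> x : toList q'

main :: IO ()
main = do
  let q3 = foldl snoc empty [1, 2, 3 :: Int]
      qa = snoc q3 4   -- one future of q3
      qb = snoc q3 5   -- a second, independent future of the same q3
  print (toList q3)    -- prints [1,2,3]
  print (toList qa)    -- prints [1,2,3,4]
  print (toList qb)    -- prints [1,2,3,5]
```

Here `qa` and `qb` are both built from the same old version `q3`, which is exactly the persistent usage that a traditional amortized argument does not cover. In this sketch the shared front list of `q3` is evaluated once and reused by both futures.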
