Implementing Functional Languages with Fast Equality, Sets and Maps: an Exercise in Hash Consing

We investigate hash consing, a memory allocation strategy for functional languages. Though the idea is not new, its systematic use as a foundation for the run-time system of a language is new. We call this systematic approach maximal sharing. This strategy is shown to be implementable in practice with small speed and space penalties, while offering great opportunities to save space and execution time in big projects. Besides, it paves the way towards more efficient implementations of very desirable data structures like sets and maps (set-theoretic functions of finite domain) [23], opening the door to a whole slew of set- and map-based functional languages like POPS-Lisp [18] and HimML [20], a variant of Standard ML written by the author. The average-case complexities of operations on sets and maps are investigated, and shown to be quite good indeed. Computation sharing and incremental computation are briefly considered in this framework. Garbage collection techniques are reviewed to decide on their suitability in a maximal sharing environment. Stop-and-copy and its variants are almost impossible to implement; however, a pleasant surprise awaits us, as mark-and-sweep becomes fast in a maximal sharing environment. Finally, benchmarks are run to get an idea of the cost of maximal sharing.

[1] Andrew W. Appel, et al. Garbage Collection can be Faster than Stack Allocation, 1987, Inf. Process. Lett.

[2] Robert Fenichel, et al. A LISP garbage-collector for virtual-memory computer systems, 1969, CACM.

[3] William M. Waite, et al. An efficient machine-independent procedure for garbage collection in various list structures, 1967, CACM.

[4] Mícheál Mac an Airchinnigh. Tutorial on the Irish School of the VDM, 1991, VDM Europe.

[5] Robin Milner, et al. Commentary on standard ML, 1990.

[6] Edmond Schonberg, et al. Programming with Sets: An Introduction to SETL, 1986.

[7] Cliff B. Jones, et al. Systematic software development using VDM, 1986, Prentice Hall International Series in Computer Science.

[8] Andrew W. Appel, et al. Compiling with Continuations, 1991.

[9] Henry Lieberman, et al. A real-time garbage collector based on the lifetimes of objects, 1983, CACM.

[10] Henry G. Baker, et al. List processing in real time on a serial computer, 1978, CACM.

[11] Frank Jackson, et al. An adaptive tenuring policy for generation scavengers, 1992, TOPL.

[12] D. A. Turner, et al. Miranda: A Non-Strict Functional language with Polymorphic Types, 1985, FPCA.

[13] William Pugh, et al. Incremental computation via function caching, 1989, POPL '89.

[14] John A. Allen, et al. The anatomy of lisp, 1980.

[15] Leslie Lamport, et al. On-the-fly garbage collection: an exercise in cooperation, 1975, CACM.

[16] Edmund M. Clarke, et al. Symbolic Model Checking: 10^20 States and Beyond, 1990, Inf. Comput.

[17] John McCarthy, et al. LISP 1.5 Programmer's Manual, 1962.

[18] Ralph E. Griswold, et al. The implementation of the Icon programming language, 1986.

[19] J. A. Robinson, et al. A Machine-Oriented Logic Based on the Resolution Principle, 1965, JACM.

[20] Daniel G. Bobrow, et al. An efficient, incremental, automatic garbage collector, 1976, CACM.

[21] Marvin Minsky, et al. A LISP Garbage Collector Algorithm Using Serial Secondary Storage, 1963.

[22] M.N. Sastry, et al. Structure and interpretation of computer programs, 1986, Proceedings of the IEEE.

[23] R. Bryant. Graph-Based Algorithms for Boolean Function Manipulation, 1986.

[24] Warren Teitelman, et al. The interlisp reference manual, 1974.

[25] Philippe Flajolet, et al. Average-Case Analysis of Algorithms and Data Structures, 1991, Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity.

[26] Mark H. Overmars, et al. The Design of Dynamic Data Structures, 1987, Lecture Notes in Computer Science.