The design and implementation of the SELF compiler, an optimizing compiler for object-oriented programming languages

Object-oriented programming languages promise improved programmer productivity through abstract data types, inheritance, and message passing. Unfortunately, traditional implementations of object-oriented language features are much slower than traditional implementations of their non-object-oriented counterparts. This dissertation describes a collection of implementation techniques that can improve the run-time performance of object-oriented languages. The new techniques identify those messages whose receiver can only be of a single representation and eliminate the overhead of message passing by replacing the message with a normal direct procedure call; these direct procedure calls are then amenable to traditional inline expansion. Type analysis extracts representation-level type information about the receivers of messages. Customization transforms a single source method into several compiled versions, each version specific to a particular inheriting receiver type; customization allows all messages to self to be inlined away (or at least replaced with direct procedure calls). To avoid generating too much compiled code, the compiler is invoked at run time, generating customized versions only for those method/receiver type pairs used by a particular program. Splitting transforms a single path through a source method into multiple separate fragments of compiled code, each fragment specific to a particular combination of run-time types. Messages to expressions of these discriminated types can then be optimized away in the split versions. These techniques have been implemented as part of the compiler for the SELF language, a purely object-oriented language designed as a refinement of Smalltalk-80. If only pre-existing implementation technology were used for SELF, programs in SELF would run one to two orders of magnitude slower than their counterparts written in a traditional non-object-oriented language. However, by applying the techniques described in this dissertation, the performance of the SELF system is five times better than that of the fastest Smalltalk-80 system, better than that of an optimizing Scheme implementation, and close to half that of an optimizing C implementation. These techniques could be applied to other object-oriented languages to boost their performance or enable a more object-oriented programming style.
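
The customization idea in the abstract can be illustrated with a small Python sketch. This is a toy model only, not the SELF compiler's actual design: `ObjType`, `SourceMethod`, `customize`, `send`, and `compiled_versions` are hypothetical names invented for the example, receivers are plain dictionaries, and "compiling" is simulated by building a closure. The sketch assumes each source method declares which selectors it sends to self and that self-sends are not recursive.

```python
class ObjType:
    """Hypothetical representation-level type: a method table plus an optional parent."""
    def __init__(self, name, methods, parent=None):
        self.name, self.methods, self.parent = name, methods, parent


class SourceMethod:
    """A source method: the selectors it sends to self, and a body that
    receives those selectors already bound to direct callables."""
    def __init__(self, self_sends, body):
        self.self_sends, self.body = self_sends, body


def lookup(rtype, selector):
    """Full dynamic lookup: walk the inheritance chain (the slow path)."""
    while rtype is not None:
        if selector in rtype.methods:
            return rtype.methods[selector]
        rtype = rtype.parent
    raise LookupError(selector)


_cache = {}              # (selector, receiver type name) -> customized version
compiled_versions = []   # record of which pairs were actually "compiled"


def customize(selector, rtype):
    """Build, on demand, a version of the method specific to one receiver type.
    Inside that version every send to self is already resolved, so it costs a
    direct call instead of a lookup. Assumes no recursive self-sends."""
    key = (selector, rtype.name)
    if key not in _cache:
        method = lookup(rtype, selector)
        compiled_versions.append(key)
        direct = {s: customize(s, rtype) for s in method.self_sends}
        _cache[key] = lambda recv, *args: method.body(recv, direct, *args)
    return _cache[key]


def send(recv, selector, *args):
    """Message send from outside: pick the version customized for the receiver's type."""
    return customize(selector, recv["type"])(recv, *args)


# One inherited source method, two receiver types that inherit it.
shape = ObjType("shape", {
    "describe": SourceMethod(("area",),
                             lambda recv, direct: f"area={direct['area'](recv)}"),
})
square = ObjType("square", {
    "area": SourceMethod((), lambda recv, direct: recv["side"] ** 2),
}, parent=shape)
circle = ObjType("circle", {
    "area": SourceMethod((), lambda recv, direct: 3 * recv["r"] ** 2),
}, parent=shape)

result = send({"type": square, "side": 3}, "describe")
```

Because the program only ever sends `describe` to a square, only the `square` versions of `describe` and `area` are generated; no code is produced for `circle`. This mirrors, in a very simplified form, the abstract's point about generating customized versions only for the method/receiver type pairs a program actually uses.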
