Formalising, improving, and reusing the Java module system

Java has no module system. Its packages merely subdivide the class namespace, which allows only a very limited form of component-level information hiding. A Java Community Process has started designing the Java Module System, a module system for Java 7, the next version of Java. The extensive draft of the design is written only in natural-language documents, which inevitably contain many ambiguities. We design and formalise LJAM, a core of the module system. Where the informal documents are complete, we follow them closely; elsewhere, we make reasonable choices. We define the syntax, the type system, and the operational semantics of this core, and specify them rigorously in the Isabelle/HOL automated proof assistant. We highlight the underlying design decisions, and discuss several alternatives and their benefits. Through analysis of our formalisation, we identify two major deficiencies of the module system: (a) its class resolution is unintuitive, insufficiently expressive, and fragile against incremental interface evolution; and (b) only a single instance of each module is permitted, which forces sharing of data and types, and so makes it difficult to reason about module invariants. We propose modest changes to the module language, and to the semantics of the class resolution, which together allow the module system to handle more scenarios in a clean and predictable manner. To develop confidence, both theoretical and practical, in our proposals, we (a) formalise in Ott the improved module system, iJAM, (b) prove in Isabelle/HOL mechanised type soundness results, and (c) give a proof-of-concept implementation in Java that closely follows the formalisation. Both of the formalisations, LJAM and iJAM, are based on Lightweight Java (LJ), our minimal imperative fragment of Java.
LJ has proved a good base language, allowing a high degree of reuse of the definitions and proof scripts, which made it possible to carry out this development relatively quickly, on the timescale of the language evolution process. Finally, we develop a module system for Thorn, an emerging, Java-like language aimed at distributed environments. We find that local aliasing and module-prefixed type references remove the need for boundary renaming, and that in the presence of multiple module instances, care is required to avoid ambiguities at de-serialisation. We conclude with a high-level overview of the interactions between the desired properties and the language features considered, and discuss possible future directions.
