Recovering grammar relationships for the Java Language Specification

Grammar convergence is a method for discovering relationships between different grammars of the same language or of different language versions. The key element of the method is the operational, transformation-based representation of those relationships: the input grammars are transformed until they are structurally equal. The transformations are composed from primitive operators; properties of these operators and of the composed chains provide quantitative and qualitative insight into the relationships between the grammars at hand. We describe a refined method for grammar convergence, and we use it in a major study in which we recover the relationships between all the grammars that occur in the different versions of the Java Language Specification (JLS). The relationships are represented as grammar transformation chains that capture all accidental or intended differences between the JLS grammars. The method is mechanized and driven by nominal and structural differences between pairs of grammars that are subject to asymmetric, binary convergence steps. We present the underlying operator suite for grammar transformation in detail, and we illustrate the suite with many examples of transformations on the JLS grammars. We also describe the extraction effort that was needed to make the JLS grammars amenable to automated processing. We include substantial metadata about the convergence process for the JLS so that the effort is reproducible and transparent.
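
To make the idea of a binary convergence step concrete, the following is a minimal sketch and not the operator suite presented in the paper. It assumes a toy grammar representation (a dictionary from nonterminals to sets of alternatives) and two hypothetical, simplified operators, `rename_nonterminal` and `add_alternative`. The grammars `source` and `target` are invented for illustration and are not taken from the JLS.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# Assumed toy representation: nonterminal -> set of alternatives,
# where each alternative is a tuple of symbols.
Alternative = Tuple[str, ...]
Grammar = Dict[str, FrozenSet[Alternative]]
Step = Callable[[Grammar], Grammar]


def rename_nonterminal(g: Grammar, old: str, new: str) -> Grammar:
    """Hypothetical operator: rename a nonterminal everywhere it occurs."""
    if old not in g:
        raise ValueError(f"{old} is not defined")
    if new in g:
        raise ValueError(f"{new} is already defined")

    def sub(symbol: str) -> str:
        return new if symbol == old else symbol

    return {
        sub(nt): frozenset(tuple(sub(s) for s in alt) for alt in alts)
        for nt, alts in g.items()
    }


def add_alternative(g: Grammar, nt: str, alt: Alternative) -> Grammar:
    """Hypothetical operator: extend a defined nonterminal by one alternative."""
    if nt not in g:
        raise ValueError(f"{nt} is not defined")
    result = dict(g)
    result[nt] = g[nt] | {alt}
    return result


def nominal_differences(a: Grammar, b: Grammar) -> Tuple[Set[str], Set[str]]:
    """Nonterminals defined only in a, and only in b."""
    return set(a) - set(b), set(b) - set(a)


def converges(source: Grammar, target: Grammar, chain: List[Step]) -> bool:
    """Apply the chain to the source and test structural equality with the target."""
    g = source
    for step in chain:
        g = step(g)
    return g == target


# Invented example grammars (not from the JLS).
source: Grammar = {
    "Stmt": frozenset({("if", "Expr", "Stmt")}),
    "Expr": frozenset({("id",)}),
}
target: Grammar = {
    "Statement": frozenset({("if", "Expr", "Statement"), ("return", "Expr")}),
    "Expr": frozenset({("id",)}),
}

# The chain is asymmetric: only the source grammar is transformed.
chain: List[Step] = [
    lambda g: rename_nonterminal(g, "Stmt", "Statement"),
    lambda g: add_alternative(g, "Statement", ("return", "Expr")),
]

only_in_source, only_in_target = nominal_differences(source, target)
converged = converges(source, target, chain)
```

In this sketch the nominal difference (`Stmt` versus `Statement`) suggests the rename step, and the remaining structural difference suggests the added alternative. The length and kind of the steps in `chain` are the sort of data from which a relationship between two grammars could be read off.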
