MetaLexer: a modular lexical specification language

Compiler toolkits make it possible to rapidly develop compilers and translators for new programming languages. Although elegant toolkits exist for modular and extensible parsers, compiler developers must often resort to ad hoc solutions when extending or composing lexers. This paper presents MetaLexer, a new modular lexical specification language and associated tool. MetaLexer allows programmers to break the lexical specification of a language into a collection of smaller modular lexical specifications. Control is passed between the modules using the concepts of meta-tokens and meta-lexing. MetaLexer has two key features: it abstracts lexical state transitions out of semantic actions, and it makes modules extensible by introducing multiple inheritance. We have constructed a MetaLexer tool that converts MetaLexer specifications to the popular JFlex lexical specification language, and we have used the tool to create lexers for three real programming languages and their extensions: AspectJ (and two AspectJ extensions), MATLAB (and the AspectMatlab extension), and MetaLexer itself. The new specifications are easier to read, are extensible, and require much less action code than the originals.
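
The idea of moving state transitions out of semantic actions can be illustrated with a small sketch. The code below is not MetaLexer syntax and not the authors' implementation; it is a simplified Python model written under stated assumptions. The module names (`code`, `string`), the meta-token names (`START_STRING`, `END_STRING`), and the `lex` function are all hypothetical. Each rule only announces a meta-token, and a separate layout table decides which module takes control next. A single meta-token triggers each transition here, which is likely simpler than the meta-lexing the paper describes.

```python
import re

# Hypothetical modules. Each rule is (regex, token kind, meta-token).
# A token kind of None means "match but emit nothing".
MODULES = {
    "code": [
        (r"\s+", None, None),
        (r"[A-Za-z_]\w*", "IDENT", None),
        (r"=", "ASSIGN", None),
        (r'"', None, "START_STRING"),
    ],
    "string": [
        (r'[^"\\]+', "STRING_TEXT", None),
        (r"\\.", "STRING_ESCAPE", None),
        (r'"', None, "END_STRING"),
    ],
}

# Hypothetical layout: transitions keyed by (current module, meta-token).
# No rule above names its successor module; only this table does.
LAYOUT = {
    ("code", "START_STRING"): "string",
    ("string", "END_STRING"): "code",
}


def lex(text, start="code"):
    """Tokenize text, switching modules whenever the layout says so."""
    module, pos, tokens = start, 0, []
    while pos < len(text):
        # Try the current module's rules in order; the first match wins.
        for pattern, kind, meta in MODULES[module]:
            match = re.compile(pattern).match(text, pos)
            if match:
                break
        else:
            raise ValueError(
                f"no rule in module {module!r} matches at offset {pos}"
            )
        if kind is not None:
            tokens.append((kind, match.group()))
        pos = match.end()
        # The rule only emitted a meta-token; the layout picks the next module.
        if meta is not None:
            module = LAYOUT.get((module, meta), module)
    return tokens
```

Calling `lex('x = "hi there" y')` yields identifier and assignment tokens from the `code` module and a single `STRING_TEXT` token from the `string` module, so `there` is never lexed as an identifier. Because the transitions live in `LAYOUT` rather than in the rules, a module could in principle be reused under a different layout without editing its rules.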
