PLCC: a programming language compiler compiler

This paper describes PLCC (Programming Language Compiler-Compiler), a compiler-compiler tool that supports courses in programming languages, compilers, and computational theory. The tool has proven useful for implementing interpreters, building compilers, and creating parsers for context-free languages. PLCC is a Perl program that takes an input file specifying the tokens, syntax, and semantics of a language and generates a complete set of Java files that implement the semantics of that language. PLCC is not intended to be a production-quality tool. Rather, it supports understanding and implementing the essential elements of lexical analysis, parsing, and semantics without the complexities of "industrial-strength" compiler-compiler tools. Students quickly learn to write PLCC "grammar" files for small languages with straightforward syntax and semantics, and to use PLCC to build Java-based parsers, interpreters, or compilers for these languages that run out-of-the-box. Input to PLCC is a text file with three sections: a token definition section that defines the language tokens as simple regular expressions, a syntax section that specifies the grammar rules of an LL(1) language as simple Backus-Naur Form (BNF) productions, and a semantics section that defines the language semantics as Java methods. PLCC generates a set of Java source files that are entirely self-contained and that import only standard elements of 'java.util' in JDK 5 and above. For testing purposes, PLCC generates a read-eval-print loop that (1) reads standard input, (2) scans, parses, and evaluates the input, and (3) prints the result of the evaluation to standard output.
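To make the scan, parse, evaluate, and print cycle concrete, the following is a minimal hand-written sketch in Java for a hypothetical toy language of integer sums. It is not PLCC input and not code that PLCC generates. The class name MiniRepl, its method names, the two token classes (NUM and PLUS), and the grammar shown in the comments are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: hand-written, NOT produced by PLCC.
// All names and the toy language itself are hypothetical.
class MiniRepl {
    // Token definitions as regular expressions:
    // NUM is \d+, PLUS is \+, and leading whitespace is skipped.
    private static final Pattern TOKEN = Pattern.compile("\\s*(?:(\\d+)|(\\+))");

    // Scan: split one line of input into a list of token strings.
    static List<String> scan(String line) {
        String text = line.trim();
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        int pos = 0;
        while (pos < text.length()) {
            m.region(pos, text.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("bad character at column " + pos);
            }
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
            pos = m.end();
        }
        return tokens;
    }

    // Parse and evaluate with one token of lookahead, following the LL(1) grammar
    //   <expr> ::= NUM <rest>
    //   <rest> ::= PLUS NUM <rest> | (empty)
    static int eval(List<String> tokens) {
        int pos = 0;
        int value = number(tokens, pos++);
        while (pos < tokens.size()) {
            if (!tokens.get(pos).equals("+")) {
                throw new IllegalArgumentException("expected '+' at token " + pos);
            }
            pos++;
            value += number(tokens, pos++);
        }
        return value;
    }

    // Return the integer value of the NUM token at index pos, or fail.
    private static int number(List<String> tokens, int pos) {
        if (pos >= tokens.size() || !tokens.get(pos).matches("\\d+")) {
            throw new IllegalArgumentException("expected a number at token " + pos);
        }
        return Integer.parseInt(tokens.get(pos));
    }

    // Scan, parse, and evaluate one line; return the text to print.
    static String evalLine(String line) {
        try {
            return String.valueOf(eval(scan(line)));
        } catch (IllegalArgumentException e) {
            return "error: " + e.getMessage();
        }
    }

    // Read-eval-print loop: one expression per line of standard input.
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        while (in.hasNextLine()) {
            System.out.println(evalLine(in.nextLine()));
        }
    }
}
```

In this sketch, entering the line 1 + 2 + 39 prints 42, while an incomplete line such as 1 + prints an error message instead of a value. The sketch folds parsing and evaluation into a single pass for brevity; a tool such as PLCC keeps the token, syntax, and semantics specifications separate, as described above.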