Featherweight TeX and Parser Correctness

TeX (and its LaTeX incarnation) is a widely used document preparation system for technical and scientific documents. It is also an unusual programming language with a powerful macro system. Despite TeX's wide user base, especially in the scientific community, and despite the widely perceived "pain" of using it, there is almost no research on TeX. This paper is an attempt to change that. To this end, we present Featherweight TeX, a formal model of TeX that we hope can play a role for TeX similar to the one Featherweight Java played for Java. The main technical problem we study in terms of Featherweight TeX is the parsing problem. As in other dynamic languages that perform syntactic analysis at runtime, the concept of "static" parsing and its correctness is unclear in TeX; we clarify it in this paper. Moreover, parsing TeX is impossible in general, but we present evidence that parsers for practical subsets exist. We furthermore outline three immediate applications of our formalization of TeX and its parsing: a macro debugger, an analysis that detects syntactic inconsistencies, and a test framework for TeX parsers.
