A technique for extracting grammar from legacy programs

Developing automated tools for maintenance, reengineering, and program analysis requires a grammar of the language in which the code is written. A grammar is frequently available for the standard language but not for the vendor-specific variants in which the code may actually be written. In this work we address the problem of obtaining a grammar from source code, which can then be used to generate tools for those programs. We propose an incremental method that derives the grammar of a particular language variant from a set of programs written in that variant and an approximate grammar (presumably that of the standard language), with some user interaction. We also present the design of a tool that implements this approach and report our experience with grammars of C, C++, and COBOL.
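
To make the incremental idea concrete, the following is a minimal sketch and not the tool described in this work. It assumes a toy grammar representation (a dictionary of productions without left recursion), a naive backtracking recognizer, and a callback that stands in for the user interaction. All names (`derive`, `accepts`, `refine`, `ask_user`) and the sample `display` statement are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical representation: nonterminal -> list of productions.
Production = Tuple[str, ...]
Grammar = Dict[str, List[Production]]


def derive(grammar: Grammar, symbols: Production,
           tokens: List[str], pos: int) -> Set[int]:
    """Return every token index reachable after matching `symbols` from `pos`.

    Naive backtracking; assumes the grammar has no left recursion.
    """
    if not symbols:
        return {pos}
    head, rest = symbols[0], symbols[1:]
    ends: Set[int] = set()
    if head in grammar:  # nonterminal: try each of its productions
        for production in grammar[head]:
            for mid in derive(grammar, production, tokens, pos):
                ends |= derive(grammar, rest, tokens, mid)
    elif pos < len(tokens) and tokens[pos] == head:  # terminal match
        ends |= derive(grammar, rest, tokens, pos + 1)
    return ends


def accepts(grammar: Grammar, start: str, tokens: List[str]) -> bool:
    """True if the whole token sequence is derivable from `start`."""
    return len(tokens) in derive(grammar, (start,), tokens, 0)


def refine(grammar: Grammar, start: str, programs: List[List[str]],
           ask_user: Callable[[Grammar, List[str]], Tuple[str, Production]],
           max_rounds: int = 10) -> Grammar:
    """Add user-supplied rules until every sample program is accepted."""
    grammar = {nt: list(ps) for nt, ps in grammar.items()}  # work on a copy
    for _ in range(max_rounds):
        failing = [p for p in programs if not accepts(grammar, start, p)]
        if not failing:
            return grammar
        # Stand-in for user interaction: ask for one rule per round.
        nonterminal, production = ask_user(grammar, failing[0])
        grammar.setdefault(nonterminal, []).append(production)
    raise RuntimeError("grammar still rejects some sample programs")


# Approximate grammar (assumed "standard" language): assignments only.
approximate: Grammar = {
    "program": [("stmt", "program"), ("stmt",)],
    "stmt": [("id", "=", "num", ";")],
}

# Token streams of programs written in an assumed vendor variant.
samples = [
    ["id", "=", "num", ";"],
    ["display", "id", ";", "id", "=", "num", ";"],
]


def scripted_user(grammar: Grammar, failing: List[str]) -> Tuple[str, Production]:
    # A real user would inspect the failing program; here the answer is fixed.
    return "stmt", ("display", "id", ";")


refined = refine(approximate, "program", samples, scripted_user)
```

In this sketch each round adds a single rule and re-checks all sample programs, so the grammar grows only as far as the samples demand. A practical tool would presumably use a generated parser and report the location of the parse failure to the user rather than the whole program.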
