Cliche recognition in legacy software: a scalable, knowledge-based approach

Many software reverse engineering techniques that are sufficiently "light-weight" (i.e., computationally inexpensive) to work on large systems tend to compute syntactic information that, while useful, does not capture the meaning of the program. At the same time, many "heavy-weight" (i.e., computationally expensive) techniques that compute information in terms of the human strategies hidden in the software tend not to be efficient enough to work on large real-world systems. We are working on applying one heavy-weight technique, program cliche recognition, to the real-world problem of software reverse engineering. This paper presents our approach to program cliche recognition and focuses on issues of scalability, robustness, and human-system interaction. We demonstrate the approach by describing how it can be applied to the reverse engineering of a real-world software system.
