Leveraging program analysis for Web site reverse engineering

Web sites are complex, heterogeneous systems that employ a large number of technologies. Evolving these systems requires the skills of a "renaissance reverse engineer". To assist reverse engineers in their efforts, new program analyses need to be developed that are specifically tailored to the task of Web site reverse engineering. To illustrate the design space for such analyses, we introduce a classification based on dichotomies and discuss each dichotomy in light of Web site reverse engineering. The main contribution of the paper is a better understanding of the features of program analyses for Web site reverse engineering.
