Program analysis and transformation for data-intensive system evolution

Data-intensive software systems generally consist of a database and a collection of application programs that interact closely with it. They constitute critical assets in most enterprises, since they support business activities in all production and management domains. Data-intensive systems make up most of the so-called legacy systems: they are typically one or more decades old, very large, heterogeneous and highly complex. Many of them strongly resist modification and change because of missing documentation, aging technologies and inflexible architectures. The evolution of data-intensive systems therefore clearly calls for automated support. This thesis explores the use of automated program analysis and transformation techniques in support of the evolution of the database component of the system. The program analysis techniques aim to ease the database evolution process by helping developers understand the data structures that are to be changed, despite the lack of precise and up-to-date documentation. The program transformation techniques aim to support the adaptation of the application programs to the new database. This adaptation process is studied in the context of two realistic database evolution scenarios, namely database schema refactoring and database platform migration.

[70]  Arie van Deursen,et al.  Industrial Applications of ASF+SDF , 1996, AMAST.

[71]  Oreste Signore,et al.  Reconstruction of ER Schema from Database Applications: a Cognitive Approach , 1994, ER.

[72]  Mohammad El-Ramly,et al.  An Experiment in Automatic Conversion of Legacy Java Programs to C# , 2006, IEEE International Conference on Computer Systems and Applications, 2006..

[73]  Colin Potts,et al.  Software-engineering research revisited , 1993, IEEE Software.

[74]  Andreas Meier,et al.  Hierarchical to relational database migration , 1994, IEEE Software.

[75]  Richard C. Waters Program Translation via Abstraction and Reimplementation , 1988, IEEE Trans. Software Eng..

[76]  Bing Wu,et al.  Legacy System Migration : A Legacy Data Migration Engine , 1997 .

[77]  Anthony Cleve,et al.  Dynamic Analysis of SQL Statements for Data-Intensive Applications Reverse Engineering , 2008, 2008 15th Working Conference on Reverse Engineering.

[78]  Magiel Bruntink,et al.  Renovation of idiomatic crosscutting concerns in embedded systems , 2005 .

[79]  Jens H. Jahnke,et al.  Varlet: Human-Centered Tool Support for Database Reengineering , 2000 .

[80]  Arie van Deursen,et al.  Identifying objects using cluster and concept analysis , 1999, Proceedings of the 1999 International Conference on Software Engineering (IEEE Cat. No.99CB37002).

[81]  Kim Mens,et al.  Mining aspectual views using formal concept analysis , 2004, Source Code Analysis and Manipulation, Fourth IEEE International Workshop on.

[82]  Mark van den Brand,et al.  A language independent framework for context-sensitive formatting , 2006, Conference on Software Maintenance and Reengineering (CSMR'06).

[83]  José Nuno Oliveira,et al.  Type-Safe Two-Level Data Transformation , 2006, FM.

[84]  James R. Cordy,et al.  WAFA: Fine-grained dynamic analysis of web applications , 2009, 2009 11th IEEE International Symposium on Web Systems Evolution.

[85]  Vincent Englebert,et al.  Database Reverse Engineering , 2009 .

[86]  Eelco Visser,et al.  Stratego/XT 0.17. A language and toolset for program transformation , 2008, Sci. Comput. Program..

[87]  Ralf Lämmel,et al.  What does aspect-oriented programming mean to Cobol? , 2005, AOSD '05.

[88]  Jean-Luc Hainaut Legacy and Future of Data Reverse Engineering , 2009, 2009 16th Working Conference on Reverse Engineering.

[89]  Hee Beng Kuan Tan,et al.  Applying static analysis for automated extraction of database interactions in web applications , 2008, Inf. Softw. Technol..

[90]  Vincent Englebert,et al.  Knowledge transfer in database reverse engineering: a supporting case study , 1997, Proceedings of the Fourth Working Conference on Reverse Engineering.

[91]  Mariano Ceccato,et al.  Aspect mining through the formal concept analysis of execution traces , 2004, 11th Working Conference on Reverse Engineering.

[92]  Alessandro Bianchi,et al.  Method and process for iterative reengineering of data in a legacy system , 2000, Proceedings Seventh Working Conference on Reverse Engineering.

[93]  Ralf Lämmel Transformations everywhere , 2004, Sci. Comput. Program..

[94]  Joseph Robert Horgan,et al.  Dynamic program slicing , 1990, PLDI '90.

[95]  Alessandro Orso,et al.  Combining static analysis and runtime monitoring to counter SQL-injection attacks , 2005, ACM SIGSOFT Softw. Eng. Notes.

[96]  Klaus R. Dittrich,et al.  On the Migration of Relational Schemas and Data to Object-OrientedDatabase Systems , 1997 .

[97]  Vincent Englebert,et al.  Program Understanding in Databases Reverse Engineering , 1998, DEXA.

[98]  Andreas Meier Providing Database Migration Tools - A Practicioner's Approach , 1995, VLDB.

[99]  Arie van Deursen,et al.  Simple crosscutting concerns are not so simple: analysing variability in large-scale idioms-based implementations , 2007, AOSD.

[100]  Jean-Luc Hainaut Network Data Model , 2009, Encyclopedia of Database Systems.

[101]  Eleni Stroulia,et al.  User Interface Reverse Engineering in Support of Interface Migration to the Web , 2003, Automated Software Engineering.

[102]  Jean-Luc Hainaut,et al.  A Generic Entity-Relationship Model , 1989, ISCO.

[103]  Erhard Rahm,et al.  Data Cleaning: Problems and Current Approaches , 2000, IEEE Data Eng. Bull..

[104]  Anthony Cleve,et al.  Automating program conversion in database reengineering: a wrapper-based approach , 2006, Conference on Software Maintenance and Reengineering (CSMR'06).

[105]  Jean-Luc Hainaut,et al.  Contribution to a theory of database reverse engineering , 1993, [1993] Proceedings Working Conference on Reverse Engineering.

[106]  Ying Zou,et al.  A framework for migrating procedural code to object-oriented platforms , 2001, Proceedings Eighth Asia-Pacific Software Engineering Conference.

[107]  Niels P. Veerman Revitalizing modifiability of legacy assets , 2004, J. Softw. Maintenance Res. Pract..

[108]  Anthony Cleve,et al.  Large-Scale Data Reengineering: Return from Experience , 2008, 2008 15th Working Conference on Reverse Engineering.

[109]  Gerardo Canfora,et al.  Developing and executing java AWT applications on limited devices with TCPTE , 2006, ICSE '06.

[110]  Joost Visser,et al.  Constraint-aware Schema Transformation , 2012, Electron. Notes Theor. Comput. Sci..

[111]  Arie van Deursen,et al.  Identifying aspects using fan-in analysis , 2004, 11th Working Conference on Reverse Engineering.

[112]  Charles W. Bachman Why restrict the modelling capability of CODASYL data structure sets , 1899 .

[113]  James H. Cross,et al.  Reverse engineering and design recovery: a taxonomy , 1990, IEEE Software.

[114]  Günter Riedewald,et al.  Towards automatical migration of transformation rules after grammar extension , 2003, Seventh European Conference onSoftware Maintenance and Reengineering, 2003. Proceedings..

[115]  Gio Wiederhold,et al.  Modelling and System Maintenance , 1995, OOER.

[116]  Kevin A. Schneider,et al.  Source transformation in software engineering using the TXL transformation system , 2002, Inf. Softw. Technol..

[117]  Jean-Marc Petit,et al.  Relational Database Reverse Engineering: A Method Based on Query Analysis , 1995, Int. J. Cooperative Inf. Syst..

[118]  Niels P. Veerman,et al.  Automated Mass Maintenance of Software Assets , 2007, 11th European Conference on Software Maintenance and Reengineering (CSMR'07).

[119]  Paul Klint,et al.  Using The Meta-Environment for Maintenance and Renovation , 2007, 11th European Conference on Software Maintenance and Reengineering (CSMR'07).

[120]  Tok Wang Ling,et al.  Correct Program Slicing of Database Operations , 1998, IEEE Softw..

[121]  Paul Klint,et al.  Term rewriting with traversal functions , 2003, TSEM.

[122]  Chris Verhoef,et al.  Scaffolding for software renovation , 2000, Proceedings of the Fourth European Conference on Software Maintenance and Reengineering.

[123]  Reiko Heckel,et al.  Architectural Transformations: From Legacy to Three-Tier and Services , 2008, Software Evolution.

[124]  Nicola Vitiello,et al.  A Strategy and an Eclipse Based Environment for the Migration of Legacy Systems to Multi-tier Web-based Architectures , 2006, 2006 22nd IEEE International Conference on Software Maintenance.

[125]  R. Lämmel,et al.  Crossing the Rubicon of API Migration , 2009 .

[126]  Gilles Dowek,et al.  Principles of programming languages , 1981, Prentice Hall International Series in Computer Science.

[127]  A. Maule,et al.  Impact analysis of database schema changes , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[128]  Randy H. Katz,et al.  Decompiling CODASYL DML into retional queries , 1982, TODS.

[129]  Liam O'Brien,et al.  Supporting Migration to Services using Software Architecture Reconstruction , 2005, 13th IEEE International Workshop on Software Technology and Engineering Practice (STEP'05).

[130]  Iulian Neamtiu,et al.  Collateral evolution of applications and databases , 2009, IWPSE-Evol '09.

[131]  Arie van Deursen,et al.  Rapid system understanding: Two COBOL case studies , 1998, Proceedings. 6th International Workshop on Program Comprehension. IWPC'98 (Cat. No.98TB100242).

[132]  Gerardo Canfora,et al.  An improved algorithm for identifying objects in code , 1996 .

[133]  Magiel Bruntink,et al.  Reengineering Idiomatic Exception Handling in Legacy C Code , 2008, 2008 12th European Conference on Software Maintenance and Reengineering.

[134]  Mark Harman,et al.  An overview of program slicing , 2001, Softw. Focus.

[135]  Pierre-Etienne Moreau,et al.  Environments for Term Rewriting Engines for Free! , 2003, RTA.

[136]  Thomas Ball,et al.  Slicing Programs with Arbitrary Control-flow , 1993, AADEBUG.

[137]  Arie van Deursen,et al.  On the use of clone detection for identifying crosscutting concern code , 2005, IEEE Transactions on Software Engineering.

[138]  Vincent Englebert,et al.  Database Design Recovery , 1996, CAiSE.

[139]  Chris Verhoef,et al.  The Realities of Language Conversions , 2000, IEEE Softw..

[140]  K. Menhoudj,et al.  Migrating Data-Oriented Applications to a Relational Database Management System , 1996, ADBIS.

[141]  Thomas Reps,et al.  The synthesizer generator , 1984 .

[142]  Kim Mens,et al.  Pitfalls in Aspect Mining , 2008, 2008 15th Working Conference on Reverse Engineering.