Quality of the Source Code for Design and Architecture Recovery Techniques: Utilities are the Problem

Software maintenance is perhaps one of the most difficult activities in software engineering, especially for systems that have undergone several years of ad hoc maintenance. The problem is that, for such systems, the gap between the system implementation and its design models tend to be considerably large. Reverse engineering techniques, particularly the ones that focus on design and architecture recovery, aim to reduce this gap by recovering high-level design views from the source code. The course code becomes then the data on which these techniques operate. In this paper, we argue that the quality of a design and architecture recovery approach depends significantly on the ability to detect and eliminate the unwanted noise in the source code. We characterize this noise as being the system utility components that tend to encumber the system structure and hinder the ability to effectively recover adequate design views of the system. We support our argument by presenting various design and architecture recovery studies that have been shown to be successful because of their ability to filter out utility components. We also present existing automatic utility detection techniques along with the challenges that remain unaddressed.

[1]  T. A. Wiggerts,et al.  Using clustering algorithms in legacy systems remodularization , 1997, Proceedings of the Fourth Working Conference on Reverse Engineering.

[2]  Nicolas Anquetil,et al.  Assessing the relevance of identifier names in a legacy software system , 1998, CASCON.

[3]  Daniel Amyot,et al.  Recovering behavioral design models from execution traces , 2005, Ninth European Conference on Software Maintenance and Reengineering.

[4]  Hausi A. Müller,et al.  Composing subsystem structures using (k,2)-partite graphs , 1990, Proceedings. Conference on Software Maintenance 1990.

[5]  Abdelwahab Hamou-Lhadj,et al.  Summarizing the Content of Large Traces to Facilitate the Understanding of the Behaviour of a Software System , 2006, 14th IEEE International Conference on Program Comprehension (ICPC'06).

[6]  Abdelwahab Hamou-Lhadj,et al.  Software Clustering Using Dynamic Analysis and Static Dependencies , 2009, 2009 13th European Conference on Software Maintenance and Reengineering.

[7]  Nicolas Anquetil,et al.  Experiments with clustering as a software remodularization method , 1999, Sixth Working Conference on Reverse Engineering (Cat. No.PR00303).

[8]  Janice Singer,et al.  How software engineers use documentation: the state of the practice , 2003, IEEE Software.

[9]  Vassilios Tzerpos,et al.  Software clustering based on omnipresent object detection , 2005, 13th International Workshop on Program Comprehension (IWPC'05).

[10]  Abdelwahab Hamou-Lhadj,et al.  An Approach for Mapping Features to Code Based on Static and Dynamic Analysis , 2008, 2008 16th IEEE International Conference on Program Comprehension.

[11]  Gregg Rothermel,et al.  Whole program path-based dynamic impact analysis , 2003, 25th International Conference on Software Engineering, 2003. Proceedings..

[12]  Richard C. Holt,et al.  ACCD: an algorithm for comprehension-driven clustering , 2000, Proceedings Seventh Working Conference on Reverse Engineering.

[13]  Emden R. Gansner,et al.  Bunch: a clustering tool for the recovery and maintenance of software system structures , 1999, Proceedings IEEE International Conference on Software Maintenance - 1999 (ICSM'99). 'Software Maintenance for Business Change' (Cat. No.99CB36360).