Automated clustering to support the reflexion method

A significant aspect of applying the Reflexion Method is the mapping of components found in the source code onto the conceptual components defined in the hypothesized architecture. To date, this mapping has been established manually, which requires considerable effort for large software systems. In this paper, we present a new approach in which clustering techniques are applied to support the user in the mapping activity. The result is a semi-automated mapping technique that combines the automatic clustering of the source model with the user's hypothesized knowledge about the system's architecture. This paper describes three case studies in which the semi-automated mapping technique, called HuGMe, has been applied successfully to extend a partial map of real-world software applications. In addition, the results of another case study from an earlier publication, which led to comparable results, are summarized. We evaluated the extended versions of two automatic software clustering techniques, namely, MQAttract and CountAttract, against oracle mappings. We closely study how the degree of completeness of the existing mapping and other controlling variables of the technique affect its ability to make reliable suggestions. Both clustering techniques achieved a mapping quality at which more than 90% of the automatic mapping decisions turned out to be correct. Moreover, the experiments indicate that the attraction function (CountAttract') based on local coupling and cohesion is more suitable for semi-automated mapping than the approach MQAttract', which is based on a global assessment of coupling and cohesion.
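To make the idea of a locally computed attraction function concrete, the following is a minimal sketch in Python. It is not the definition of CountAttract' used in the paper: the function names, the weighted dependency dictionary, and the `dominance` threshold are all assumptions chosen for illustration. The sketch only shows the general shape of semi-automated mapping, in which an unmapped entity is attracted to the hypothesized component that its already mapped neighbors belong to, and ambiguous cases are left to the user.

```python
from collections import defaultdict


def count_attraction(entity, dependencies, mapping):
    """Sum the dependency weights between `entity` and each hypothesized
    component, counting only neighbors that are already mapped.

    `dependencies` maps (source, target) pairs to a weight; this data
    layout is an assumption made for the sketch.
    """
    scores = defaultdict(float)
    for (src, dst), weight in dependencies.items():
        if src == entity and dst in mapping:
            scores[mapping[dst]] += weight
        elif dst == entity and src in mapping:
            scores[mapping[src]] += weight
    return dict(scores)


def suggest_mapping(entities, dependencies, mapping, dominance=2.0):
    """Suggest a component for each unmapped entity, but only when the
    best candidate clearly dominates the runner-up. The `dominance`
    factor is a hypothetical control parameter, not taken from the paper.
    """
    suggestions = {}
    for entity in entities:
        if entity in mapping:
            continue  # already mapped by the user
        scores = count_attraction(entity, dependencies, mapping)
        if not scores:
            continue  # no mapped neighbors, so there is no evidence
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        best_component, best_score = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
        if best_score >= dominance * runner_up:
            suggestions[entity] = best_component
        # otherwise the decision is left to the user
    return suggestions


# Hypothetical example data: a partial map and a few weighted dependencies.
dependencies = {
    ("parser.c", "lexer.c"): 3,
    ("parser.c", "ast.c"): 2,
    ("parser.c", "log.c"): 1,
    ("ui.c", "log.c"): 1,
    ("ui.c", "lexer.c"): 1,
}
partial_map = {"lexer.c": "Frontend", "ast.c": "Frontend", "log.c": "Util"}
entities = ["parser.c", "ui.c", "lexer.c"]

suggested = suggest_mapping(entities, dependencies, partial_map)
```

In this example, `parser.c` is strongly attracted to `Frontend` and receives a suggestion, whereas `ui.c` is equally attracted to two components and is therefore left for the user to map by hand.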
