Component clustering based on maximal association

This paper presents a supervised clustering framework for recovering the architecture of a software system. The technique measures the association between system components (such as files) in terms of the data and control flow dependencies among groups of highly related entities that are scattered throughout those components. Data mining techniques are applied to extract the maximum association among the groups of entities. This association serves as a measure of closeness among the system files, which are then collected into subsystems using an optimization clustering technique. A two-phase supervised clustering process incrementally generates the clusters and controls the quality of the system decomposition. To address complexity issues, the whole clustering space is decomposed into subspaces based on the association property. At each iteration, the subspaces are analyzed to determine the most eligible subspace for the next cluster, and an optimization search then generates the new cluster.