Mixed-Integer Linear Programming Formulations for the Software Clustering Problem

The clustering problem has an important application in software engineering, which usually deals with large software systems with complex structures. To facilitate the work of software maintainers, the components of a system are divided into groups so that highly interdependent modules are placed in the same group and independent modules are placed in different groups. The measure used to assess the quality of such a partition is called Modularization Quality (MQ). Designers represent the software system as a graph in which nodes represent modules and edges represent relationships between modules. This graph is referred to in the literature as the Module Dependency Graph (MDG). The Software Clustering Problem (SCP) consists of finding the partition of the MDG that maximizes the MQ.

In this paper we present three new mathematical programming formulations for the SCP. First, we formulate the SCP as a sum-of-linear-fractional-functions problem; we then apply two different linearization procedures to reformulate it as Mixed-Integer Linear Programming (MILP) problems. We discuss a preprocessing technique that reduces the size of the original problem and develop valid inequalities that have been shown to be very effective in tightening the formulations. We present numerical results that compare the proposed formulations with one another and, for benchmark problems, with the solutions obtained by the exhaustive algorithm of the freely available Bunch clustering tool.
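To make the objective concrete, the following is a minimal sketch of how MQ might be evaluated for a given partition of an MDG. The abstract does not state the MQ formula, so the sketch assumes the cluster-factor form commonly associated with Bunch, in which each cluster with mu internal edges and eps boundary edges contributes 2*mu / (2*mu + eps). This is a sum of ratios, consistent with the "sum of linear fractional functions" description, but it is not necessarily the exact definition used in the paper. The function name, the unweighted edge-list input, and the `cluster_of` mapping are illustrative choices.

```python
from collections import defaultdict


def modularization_quality(edges, cluster_of):
    """Evaluate MQ for a partition of a module dependency graph.

    ASSUMPTION: MQ is the sum over clusters of 2*mu / (2*mu + eps), where
    mu counts edges inside the cluster and eps counts edges crossing its
    boundary in either direction. The source text does not give the formula.

    edges      -- iterable of (source_module, target_module) pairs
    cluster_of -- dict mapping each module to its cluster label
    """
    intra = defaultdict(int)  # mu: edges with both endpoints in the cluster
    inter = defaultdict(int)  # eps: edges entering or leaving the cluster

    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        if cu == cv:
            intra[cu] += 1
        else:
            inter[cu] += 1
            inter[cv] += 1

    mq = 0.0
    for c in set(cluster_of.values()):
        mu, eps = intra[c], inter[c]
        if mu > 0:  # a cluster without internal edges contributes 0
            mq += 2 * mu / (2 * mu + eps)
    return mq


# A four-module chain a -> b -> c -> d, split into two clusters.
chain = [("a", "b"), ("b", "c"), ("c", "d")]
two_clusters = {"a": 0, "b": 0, "c": 1, "d": 1}
print(modularization_quality(chain, two_clusters))
```

Under this assumed definition, splitting the chain into {a, b} and {c, d} scores higher than placing all four modules in one cluster or in four singleton clusters, which reflects the trade-off between cohesion and coupling that the SCP optimizes.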
