Condensing Class Diagrams With Minimal Manual Labeling Cost

To better understand the design of a project, developers traditionally reconstruct a class diagram from its source code using reverse engineering. However, the raw diagram is often hard to read because it contains too many classes. Condensing the reverse-engineered class diagram into a compact diagram that contains only the important classes would make the project easier to understand. Several recent works have proposed supervised machine learning solutions that condense reverse-engineered class diagrams given a set of classes manually labeled as important or not. However, the practicality of these solutions is limited by the high cost of manually labeling training samples. More training samples lead to better performance but also to higher labeling cost, and too much manual labeling defeats the purpose, since the aim is to identify important classes automatically. To bridge this gap, we propose MCCondenser, a novel approach that requires only a small amount of training data yet still achieves reasonably good performance. MCCondenser first selects a small, most representative proportion of the data as training data in an unsupervised way using k-means clustering. Next, it uses ensemble learning to handle the class imbalance problem so that a suitable classifier can be built from the limited training data. To evaluate MCCondenser, we use datasets from nine open source projects (ArgoUML, JavaClient, JGAP, JPMC, Mars, Maze, Neuroph, Wro4J and xUML) containing a total of 2640 classes. We compare MCCondenser with two baseline approaches proposed by Thung et al., both state-of-the-art approaches that aim to reduce the manual labeling cost. The experimental results show that MCCondenser achieves an average AUC score of 0.73, improving on the two baselines by nearly 20% and 10%, respectively.
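The unsupervised selection step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features MCCondenser clusters on, how many clusters it uses, or how it picks samples from each cluster. The sketch assumes that each class is described by a small vector of numeric metrics, that k-means is seeded with a deterministic farthest-first initialization, and that the class nearest each centroid is the one handed to a human for labeling. All function names and the toy data are hypothetical.

```python
import math


def farthest_first_init(points, k):
    """Deterministic seeding (an assumption): start from the first point,
    then repeatedly add the point farthest from all chosen centroids."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means; returns (centroids, assignment)."""
    centroids = farthest_first_init(points, k)
    assignment = None
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment


def select_representatives(points, k):
    """Return the index of the point closest to each cluster centroid.
    Only these points would need a manual label."""
    centroids, assignment = kmeans(points, k)
    chosen = []
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            chosen.append(
                min(members, key=lambda i: math.dist(points[i], centroids[c]))
            )
    return sorted(chosen)


# Hypothetical two-metric description of ten classes (toy data).
class_metrics = [
    (1, 1), (1, 2), (2, 1), (2, 2), (1.5, 1.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
]
to_label = select_representatives(class_metrics, k=2)
print(to_label)  # prints [4, 9]
```

In this toy run only two of the ten classes are selected for labeling, one per cluster. The ensemble step that handles class imbalance is not shown, because the abstract gives no detail about how it is constructed.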
