Combining software quality predictive models: an evolutionary approach

During the last ten years, a large number of quality models have been proposed in the literature. In general, the goal of these models is to predict a quality factor starting from a set of direct measures. The lack of data behind these models makes it hard to generalize, cross-validate, and reuse existing models. As a consequence, for a company, selecting an appropriate quality model is a difficult, non-trivial decision. In this paper, we propose a general approach and a particular solution to this problem. The main idea is to combine and adapt existing models (experts) in such a way that the combined model works well on the particular system or in the particular type of organization. In our particular solution, the experts are assumed to be decision tree or rule-based classifiers and the combination is done by a genetic algorithm. The result is a white-box model: for each software component, not only does the model give a prediction of the software quality factor, it also provides the expert that was used to obtain the prediction. Test results indicate that the proposed model performs significantly better than individual experts in the pool.

[1]  Horst Zuse,et al.  A Framework of Software Measurement , 1998 .

[2]  Chris F. Kemerer,et al.  A Metrics Suite for Object Oriented Design , 2015, IEEE Trans. Software Eng..

[3]  Lionel C. Briand,et al.  Empirical Studies of Quality Models in Object-Oriented Systems , 2002, Adv. Comput..

[4]  Aiko M. Hormann,et al.  Programs for Machine Learning. Part I , 1962, Inf. Control..

[5]  Ran El-Yaniv,et al.  Localized Boosting , 2000, COLT.

[6]  Premkumar T. Devanbu,et al.  An Investigation into Coupling Measures for C++ , 1997, Proceedings of the (19th) International Conference on Software Engineering.

[7]  Stephen G. MacDonell,et al.  A comparison of techniques for developing predictive models of software metrics , 1997, Inf. Softw. Technol..

[8]  Eddy Mayoraz,et al.  DynaBoost: Combining Boosted Hypotheses in a Dynamic Way , 1999 .

[9]  Emanuel Falkenauer,et al.  Genetic Algorithms and Grouping Problems , 1998 .

[10]  Norman E. Fenton,et al.  Software metrics: roadmap , 2000, ICSE '00.

[11]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[12]  Yoav Freund,et al.  Boosting the margin: A new explanation for the effectiveness of voting methods , 1997, ICML.

[13]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[14]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[15]  Lionel C. Briand,et al.  Exploring the relationships between design measures and software quality in object-oriented systems , 2000, J. Syst. Softw..

[16]  Geoffrey E. Hinton,et al.  Learning representations by back-propagating errors , 1986, Nature.

[17]  Geoffrey E. Hinton,et al.  Adaptive Mixtures of Local Experts , 1991, Neural Computation.