Linear Rule Based Ensemble Methods for the Prediction of Number of Faults

In the previous chapter, we explored linear rule based ensemble methods for predicting the number of faults. In that work, we used four different ensemble methods, each of which combines the outputs of the base learners in a linear form. The experimental results showed that linear rule based ensemble methods can achieve stable and accurate fault prediction performance. However, these ensemble methods capture only the weighted contributions of the base learners and combine them linearly, so they may suffer from linearity error, that is, the error of fitting a straight line to a relationship that is not linear (Fox, Regression Diagnostics: An Introduction, Sage, 1991 [8]).
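To make the idea of a linear combination rule concrete, the sketch below combines the outputs of several base learners as a weighted sum. It is only an illustration: the function name `linear_combination`, the three fault-count predictions, and both weight vectors are hypothetical, and the abstract does not say which four linear rules were actually used.

```python
def linear_combination(predictions, weights):
    """Combine base-learner predictions as a weighted sum (a linear rule)."""
    if len(predictions) != len(weights):
        raise ValueError("one weight per base learner is required")
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical fault-count predictions for one module from three base learners.
base_predictions = [2.0, 4.0, 3.0]

# Simple averaging is the special case in which every weight is equal.
equal_weights = [1 / 3] * 3

# Weighted averaging uses unequal weights; these values are assumed,
# not taken from the chapter.
custom_weights = [0.5, 0.25, 0.25]

simple_avg = linear_combination(base_predictions, equal_weights)
weighted_avg = linear_combination(base_predictions, custom_weights)

print(simple_avg, weighted_avg)
```

Whatever the weights, the ensemble output remains a linear function of the base learners' outputs, which is the limitation the abstract points to.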

[1]  Jerome H Friedman, et al. Multiple additive regression trees with application in epidemiology, 2003, Statistics in medicine.

[2]  Taghi M. Khoshgoftaar, et al. Empirical case studies of combining software quality classification models, 2003, Third International Conference on Quality Software, 2003. Proceedings.

[3]  C. J. Stone, et al. Additive Regression and Other Nonparametric Models, 1985.

[4]  Thomas J. Ostrand, et al. PROMISE Repository of empirical software engineering data, 2007.

[5]  Geoffrey I. Webb, et al. Multistrategy ensemble learning: reducing error by combining ensemble learning techniques, 2004, IEEE Transactions on Knowledge and Data Engineering.

[6]  Antonio Criminisi, et al. Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning, 2012, Found. Trends Comput. Graph. Vis.

[7]  Norman E. Fenton, et al. A Critique of Software Defect Prediction Models, 1999, IEEE Trans. Software Eng.

[8]  J. Fox, et al. Regression Diagnostics: An Introduction, 1991.

[9]  Alípio Mário Jorge, et al. Ensemble approaches for regression: A survey, 2012, CSUR.

[10]  R. Tibshirani, et al. Combining Estimates in Regression and Classification, 1996.

[11]  L. Breiman. Stacked Regressions, 1996, Machine Learning.

[12]  Ian H. Witten, et al. WEKA: a machine learning workbench, 1994, Proceedings of ANZIIS '94 - Australian New Zealand Intelligent Information Systems Conference.

[13]  Xiaoyuan Jing, et al. Multiple kernel ensemble learning for software defect prediction, 2015, Automated Software Engineering.

[14]  Venkata U. B. Challagulla, et al. A Unified Framework for Defect Data Analysis Using the MBR Technique, 2006, 2006 18th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'06).

[15]  Elie Bienenstock, et al. Neural Networks and the Bias/Variance Dilemma, 1992, Neural Computation.

[16]  Andy Liaw, et al. Classification and Regression by randomForest, 2007.

[17]  Bart Baesens, et al. Mining software repositories for comprehensible software fault prediction models, 2008, J. Syst. Softw.

[18]  Charles Yang, et al. Partition testing, stratified sampling, and cluster analysis, 1993, SIGSOFT '93.

[19]  Yue Jiang, et al. Techniques for evaluating fault prediction models, 2008, Empirical Software Engineering.

[20]  Michael J. Pazzani, et al. Classification and regression by combining models, 1998.

[21]  Rakesh Agrawal, et al. Privacy-preserving data mining, 2000, SIGMOD 2000.

[22]  Gavin Brown, et al. Diversity in neural network ensembles, 2004.

[23]  Qinbao Song, et al. Using Coding-Based Ensemble Learning to Improve Software Defect Prediction, 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).

[24]  Ullrich Köthe, et al. Learning to count with regression forest and structured labels, 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).

[25]  Saso Dzeroski, et al. Combining Classifiers with Meta Decision Trees, 2003, Machine Learning.

[26]  Bart Baesens, et al. Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings, 2008, IEEE Transactions on Software Engineering.

[27]  Ayse Basar Bener, et al. Defect prediction from static code features: current results, limitations, new approaches, 2010, Automated Software Engineering.

[28]  Mahmoud O. Elish. Improved estimation of software project effort using multiple additive regression trees, 2009, Expert Syst. Appl.

[29]  Thomas G. Dietterich. Multiple Classifier Systems, 2000, Lecture Notes in Computer Science.

[30]  Sandeep Kumar, et al. Linear and non-linear heterogeneous ensemble methods to predict the number of faults in software systems, 2017, Knowl. Based Syst.