Adaptive software fault prediction approach using object-oriented metrics

Abstract of the dissertation "Adaptive Software Fault Prediction Approach Using Object-Oriented Metrics" by Djuradj Babic. Florida International University, Miami, Florida, 2012. Major Professor: Professor Naphtali Rishe.

As users continually request additional functionality, software systems will continue to grow in complexity and in their susceptibility to failures. Faulty system modules can increase development and maintenance costs, particularly for sensitive systems that require higher levels of reliability. Identifying such modules early would therefore support the development of reliable systems through improved scheduling and quality control. As a consequence, substantial research effort has gone into predicting which software modules are likely to contain faults. Although a wide range of fault prediction models has been proposed, we remain far from having reliable tools that can be widely applied to real industrial systems.

For projects with known fault histories, numerous research studies show that statistical models built on software metrics can predict faulty modules with reasonable accuracy. However, because context-specific metrics differ from project to project, predicting across projects is difficult. Prediction models derived from one project are ineffective at predicting fault-prone modules when applied to other projects. As a result, the ability to take full advantage of existing work in the software development community has been substantially limited. As a step towards solving this problem, this dissertation proposes a fault prediction approach that exploits existing prediction models, adapting them to improve their ability to predict faulty system modules across different software projects.
