Software Fault Proneness Prediction Using Support Vector Machines

Empirical validation of software metrics for predicting quality with machine learning methods is important to ensure their practical relevance in software organizations. In this paper, we build a Support Vector Machine (SVM) model to find the relationship between the object-oriented metrics given by Chidamber and Kemerer and fault proneness. The proposed model is empirically evaluated using the public-domain KC1 NASA data set. The performance of the SVM method is evaluated by Receiver Operating Characteristic (ROC) analysis. Based on the results, it is reasonable to claim that such models could help in planning and performing testing by focusing resources on the fault-prone parts of the design and code. Thus, the study shows that the SVM method may also be used in constructing software quality models.
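To make the ROC-based evaluation concrete, the following is a minimal Python sketch of how the area under the ROC curve (AUC) can be computed from classifier scores. It is an illustration only, not the procedure used in the study: the function name `roc_auc` and the sample labels and scores are hypothetical, and the scores simply stand in for the decision values an SVM might assign to each class.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen faulty class receives a
    higher score than a randomly chosen non-faulty one (ties count as 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one faulty and one non-faulty class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data, not taken from KC1: 1 = faulty, 0 = not faulty.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 0.889
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of faulty from non-faulty classes, so values closer to 1.0 indicate a model that is more useful for directing testing effort.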