Predicting Defect Prone Modules in Web Applications

Predicting defect proneness of software products has been an active research area in the software engineering domain in recent years. Researchers have so far used static code metrics, code churn metrics, developer networks, and module networks as inputs to their proposed models. However, domain-specific characteristics of software have not been taken into account. In this research, we propose a new set of metrics that exploit the characteristics of web applications to improve defect prediction performance for them. To validate our hypotheses, we conducted experiments on datasets from three open source web applications, performing defect prediction with different machine learning algorithms. The results revealed that the overall performance of defect predictors improved compared to using existing static code metrics alone. We therefore recommend that practitioners use domain-specific characteristics in defect prediction, as they can be informative.
