Using Tri-Relation Networks for Effective Software Fault-Proneness Prediction

Software modules and developers are two core elements of the software development process. Previous studies have shown that analyzing the relations (1) between software modules, (2) between developers, and (3) between modules and developers is critical to understanding how they interact with each other, which ultimately affects software quality. Specifically, the developer contribution relation, the module dependency relation, and the developer collaboration relation have been used independently or in pairs to build networks for software fault-proneness prediction. However, none of these studies has investigated the three relations jointly. In this paper, we propose a tri-relation network, a weighted network that integrates the developer contribution, module dependency, and developer collaboration relations to study their combined impact on software quality. Four network node centrality metrics are derived from the proposed network to predict the fault-proneness of a given software module at the file level. Moreover, we explore a mechanism for refining current networks to further improve the effectiveness of software fault-proneness prediction.
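
As a rough illustration of the idea, the sketch below merges three weighted edge lists (developer contribution, module dependency, developer collaboration) into a single undirected weighted graph and computes one simple centrality score for each file-level module node. Everything specific here is an assumption made for illustration: the node names, the edge weights, the choice to sum the weights of edges that connect the same pair of nodes, and the use of weighted degree as the metric. The abstract does not name the four centrality metrics or describe the weighting scheme, so this is not the authors' construction.

```python
from collections import defaultdict


def build_tri_relation_network(contributions, dependencies, collaborations):
    """Merge three weighted edge lists into one undirected weighted graph.

    Each edge is a (node_u, node_v, weight) tuple. Summing the weights of
    edges between the same pair of nodes is an illustrative assumption.
    """
    graph = defaultdict(dict)
    for edges in (contributions, dependencies, collaborations):
        for u, v, weight in edges:
            graph[u][v] = graph[u].get(v, 0.0) + weight
            graph[v][u] = graph[v].get(u, 0.0) + weight
    return graph


def weighted_degree(graph, node):
    """Sum of the weights of all edges incident to the node."""
    return sum(graph[node].values())


# Hypothetical toy data: two developers and two files.
contributions = [
    ("dev:alice", "mod:a.c", 3.0),  # developer-to-module edges
    ("dev:bob", "mod:a.c", 1.0),
    ("dev:bob", "mod:b.c", 2.0),
]
dependencies = [("mod:a.c", "mod:b.c", 1.0)]       # module-to-module edge
collaborations = [("dev:alice", "dev:bob", 1.0)]   # developer-to-developer edge

network = build_tri_relation_network(contributions, dependencies, collaborations)

# One centrality value per file-level module; such values could serve as
# features for a fault-proneness classifier.
module_scores = {
    node: weighted_degree(network, node)
    for node in network
    if node.startswith("mod:")
}
print(module_scores)
```

In a fuller pipeline, each module's centrality values would form a feature vector that is paired with fault data to train a prediction model; that step is omitted here because the abstract does not specify the model.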
