Studying the Impact of Social Structures on Software Quality

Correcting software defects consumes a significant amount of resources such as time, money, and personnel. To focus testing efforts where they are needed most, researchers have studied statistical models that predict which parts of a software system are likely to contain future defects. By studying the mathematical relations between the predictor variables used in these models, researchers can better understand the connections between development activities and software quality. Predictor variables used in past top-performing models are largely based on file-oriented measures, such as source code and churn metrics. However, source code is the end product of numerous interlaced and collaborative activities carried out by developers. Traces of such activities can be found in the repositories used to manage development efforts. In this paper, we investigate statistical models to study the impact of social structures between developers and end-users on software quality. These models use predictor variables based on social information mined from the issue tracking and version control repositories of a large open-source software project. The results of our case study are promising and indicate that statistical models based on social information have a degree of explanatory power similar to that of traditional models. Furthermore, our findings suggest that social information does not substitute for, but rather augments, the traditional product- and process-based metrics used in defect prediction models.
