Putting It All Together: Using Socio-technical Networks to Predict Failures

Studies have shown that social factors in development organizations have a dramatic effect on software quality. Separately, program dependency information has also been used successfully to predict which software components are more fault-prone. Interestingly, the influence of these two phenomena has only been studied separately. Intuition and practical experience suggest, however, that task assignment (i.e., who worked on which components and how much) and dependency structure (which components depend on others) interact to influence the quality of the resulting software. We study the influence of combined socio-technical software networks on the fault-proneness of individual software components within a system. The network properties of a software component in this combined network predict whether the component is failure-prone with greater accuracy than prior methods that use dependency or contribution information in isolation. We evaluate our approach in different settings by applying it to Windows Vista and to six releases of the Eclipse development environment, including using models built from one release to predict failure-prone components in the next release, and we compare the results to previous work. In every case, our method performs as well or better and more accurately identifies the software components that have more post-release failures, with precision and recall rates as high as 85%.
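
As a rough illustration of the idea of a combined network, the sketch below merges contribution edges (developer to component) and dependency edges (component to component) into one undirected graph and computes a single network property, normalized degree centrality, for each node. The function names, the toy data, and the choice of degree centrality are assumptions made for illustration only; the sketch ignores how much each developer contributed and is not the measure set or the prediction model used in the study.

```python
from collections import defaultdict


def build_network(contributions, dependencies):
    """Merge both edge types into one undirected adjacency map.

    Nodes are tagged tuples so that a developer and a component
    sharing a name cannot collide (hypothetical convention).
    """
    adjacency = defaultdict(set)
    for developer, component in contributions:
        adjacency[("dev", developer)].add(("comp", component))
        adjacency[("comp", component)].add(("dev", developer))
    for source, target in dependencies:
        adjacency[("comp", source)].add(("comp", target))
        adjacency[("comp", target)].add(("comp", source))
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly linked to."""
    node_count = len(adjacency)
    if node_count < 2:
        return {node: 0.0 for node in adjacency}
    return {
        node: len(neighbors) / (node_count - 1)
        for node, neighbors in adjacency.items()
    }


# Toy data (invented): who touched which component, and which
# component depends on which.
contributions = [("alice", "parser"), ("bob", "parser"), ("bob", "ui")]
dependencies = [("ui", "parser"), ("parser", "core")]

network = build_network(contributions, dependencies)
centrality = degree_centrality(network)
for node, score in sorted(centrality.items()):
    if node[0] == "comp":
        print(node[1], round(score, 2))
```

In a setting like the one described, per-component measures of this kind would serve as input features to a classifier that separates failure-prone components from the rest; that modeling step is left out here.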
