Belief & Evidence in Empirical Software Engineering

Empirical software engineering has produced a steady stream of evidence-based results concerning the factors that affect important outcomes such as cost, quality, and interval. However, programmers often also have strongly held a priori opinions about these issues. These opinions matter, since developers are highly trained professionals whose beliefs doubtless affect their practice. As in evidence-based medicine, disseminating empirical findings to developers is a key step in ensuring that the findings influence practice. In this paper, we describe a case study of the prior beliefs of developers at Microsoft and the relationship of those beliefs to actual empirical data from the projects in which the developers work. Our findings are that a) programmers do indeed hold very strong beliefs on certain topics; b) these beliefs are formed primarily from personal experience rather than from findings in empirical research; and c) beliefs can vary from project to project, yet do not necessarily correspond with the actual evidence in that project. Our findings suggest that more effort should be made to disseminate empirical findings to developers, and that the interplay of belief and evidence in software practice merits further in-depth study.
