Empirical Software Engineering Research - The Good, The Bad, The Ugly

The Software Engineering Research community has slowly recognized that empirical studies are an important way of validating ideas and increasingly our community has stopped accepting the sufficiency of arguing that a smart person has come up with the idea and therefore it must be good. This has led to a flood of Software Engineering papers that contain at least some form of empirical study. However, not all empirical studies are created equal, and many may not even provide any useful information or value. We survey the gradual shift from essentially no empirical studies, to a small number of ones of questionable value, and look at what we need to do to insure that our empirical studies really contribute to the state of knowledge in the field. Thus we have the good, the bad, and the ugly. What are we as a community doing correctly? What are we doing less well than we should be because we either don't have the necessary artifacts or because the time and resources required to do "the good" is perceived to be too great? And where are we missing the boat entirely in terms of not addressing critical questions and often not even recognizing that these questions are central even if we don't know the answers. We look to see whether we can find some commonality in the projects that have really made the transition from research to widespread practice to see whether we can identify some common themes.

[1]  Elaine J. Weyuker,et al.  Predicting the location and number of faults in large software systems , 2005, IEEE Transactions on Software Engineering.

[2]  Marvin V. Zelkowitz,et al.  Lessons learned from 25 years of process improvement: the rise and fall of the NASA software engineering laboratory , 2002, Proceedings of the 24th International Conference on Software Engineering. ICSE 2002.

[3]  Elaine J. Weyuker,et al.  Collecting and categorizing software error data in an industrial environment , 2018, J. Syst. Softw..

[4]  Thomas J. Ostrand,et al.  Experiments on the effectiveness of dataflow- and control-flow-based test adequacy criteria , 1994, Proceedings of 16th International Conference on Software Engineering.

[5]  P. Frankl,et al.  Assessing the fault-detecting ability of testing methods , 1991, SIGSOFT '91.

[6]  Nachiappan Nagappan,et al.  Predicting defects using network analysis on dependency graphs , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[7]  J WeyukerElaine,et al.  Do too many cooks spoil the broth? Using the number of developers to enhance defect prediction models , 2008 .

[8]  Elaine J. Weyuker,et al.  Software testing research and software engineering education , 2010, FoSER '10.

[9]  Gregg Rothermel,et al.  Infrastructure support for controlled experimentation with software testing and regression testing techniques , 2004, Proceedings. 2004 International Symposium on Empirical Software Engineering, 2004. ISESE '04..

[10]  Elaine J. Weyuker,et al.  Comparing the effectiveness of several modeling methods for fault prediction , 2010, Empirical Software Engineering.

[11]  Lionel C. Briand,et al.  A systematic and comprehensive investigation of methods to build and evaluate fault prediction models , 2010, J. Syst. Softw..

[12]  Elaine J. Weyuker,et al.  Data flow analysis techniques for test data selection , 2015, ICSE '82.

[13]  Elaine J. Weyuker,et al.  A Formal Analysis of the Fault-Detecting Ability of Testing Methods , 1993, IEEE Trans. Software Eng..

[14]  Brendan Murphy,et al.  Can developer-module networks predict failures? , 2008, SIGSOFT '08/FSE-16.

[15]  Elaine J. Weyuker,et al.  Comparison of program testing strategies , 1991, TAV4.

[16]  Andreas Zeller,et al.  Mining metrics to predict component failures , 2006, ICSE.

[17]  Elaine J. Weyuker,et al.  Selecting Software Test Data Using Data Flow Information , 1985, IEEE Transactions on Software Engineering.

[18]  A. Zeller,et al.  Predicting Defects for Eclipse , 2007, Third International Workshop on Predictor Models in Software Engineering (PROMISE'07: ICSE Workshops 2007).

[19]  Giovanni Denaro,et al.  An empirical evaluation of fault-proneness models , 2002, ICSE '02.