Operational data are missing, incorrect, and decontextualized

Big data is upon us, and data scientist is the hottest profession. In software engineering, the analysis of software development data from social networks, issue trackers, or version control systems is proliferating.
