Co-evolution of Infrastructure and Source Code - An Empirical Study

Infrastructure-as-code automates the configuration and setup of the environment (e.g., servers, VMs, and databases) in which a software system will be tested and/or deployed, through textual specification files written in a language like Puppet or Chef. Since the environment is instantiated automatically by the infrastructure languages' tools, no manual intervention is necessary apart from maintaining the infrastructure specification files. Neither the amount of work involved in such maintenance nor the size and complexity of infrastructure specification files has yet been studied empirically. Through an empirical study of the version control system of 265 OpenStack projects, we find that infrastructure files are large and churn frequently, which could indicate a potential for introducing bugs. Furthermore, we find that infrastructure code files are tightly coupled with the other files in a project, especially test files, which implies that testers often need to change infrastructure specifications when making changes to the test framework and tests.
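As a rough illustration of what coupling between file categories can mean in a version control history, the sketch below computes, for each ordered pair of categories, the fraction of commits touching the first category that also touch the second (an association-rule style confidence). It is not the study's actual procedure: the `classify` heuristics, the category names, and the input format (one list of changed paths per commit) are all assumptions made for this example.

```python
from collections import Counter
from itertools import permutations


def classify(path):
    """Assign a file to a category using hypothetical path heuristics.

    These rules are illustrative assumptions, not the paper's classification.
    """
    if path.endswith((".pp", ".erb")) or "/manifests/" in path or "/cookbooks/" in path:
        return "infrastructure"
    if "/tests/" in path or path.split("/")[-1].startswith("test_"):
        return "test"
    return "other"


def coupling(commits):
    """Return {(a, b): confidence} over ordered pairs of categories.

    confidence(a, b) is the share of commits touching category a
    that also touch category b. `commits` is a list of lists of paths.
    """
    touched = Counter()   # number of commits touching each category
    together = Counter()  # number of commits touching both a and b
    for files in commits:
        categories = {classify(f) for f in files}
        touched.update(categories)
        together.update(permutations(sorted(categories), 2))
    return {(a, b): n / touched[a] for (a, b), n in together.items()}
```

Under these assumptions, a high value for the pair ("test", "infrastructure") would mean that commits changing test files frequently also change infrastructure files, which is the kind of relationship the abstract describes. The per-commit file lists could be extracted with a command such as `git log --name-only`.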
