On the practice of semantic versioning for Ansible galaxy roles: An empirical study and a change classification model

Abstract: Ansible, a popular Infrastructure-as-Code platform, provides reusable collections of tasks called roles. Roles are often contributed by third parties, and like general-purpose libraries, they evolve. New releases of roles therefore need to be tagged with version numbers, for which Ansible recommends adhering to the semantic versioning format. However, roles differ significantly from general-purpose libraries, and it is not yet known what constitutes a breaking change or the addition of a feature to a role. This uncertainty can confuse both the clients of a role and new role contributors. To alleviate this issue, we perform an empirical study on semantic versioning in Ansible roles to uncover the types of changes that trigger certain types of version bumps. Our dataset comprises over 81 000 version increments across more than 8 500 Ansible roles. We design a novel structural model for these roles, and implement a domain-specific structural change extraction algorithm to calculate structural difference metrics. Afterwards, we quantitatively investigate the state of semantic versioning in Ansible roles and identify the most commonly changed elements. Then, using the structural difference metrics, we train a Random Forest classifier to predict applicable version bumps for Ansible role releases. Finally, we confirm our empirical findings with a developer survey. Our observations show that although most Ansible role developers follow the semantic versioning format, they do not appear to apply the same rules consistently when selecting the version bump. Moreover, we find that the distinction between patch and minor increments is often unclear. We therefore use the gained insights to formulate a number of guidelines for applying semantic versioning to Ansible roles. Role developers can use these guidelines to ensure a clear interpretation of version increments.
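To make the notion of a "version bump" concrete, the following minimal Python sketch labels the increment between two release tags as major, minor, or patch. It is an illustration only, not the extraction pipeline used in the study: the names `parse` and `bump_type` are hypothetical, and the sketch assumes plain `MAJOR.MINOR.PATCH` tags with an optional leading `v`, ignoring pre-release and build metadata.

```python
import re
from typing import Optional, Tuple

# Assumption: tags look like "1.2.3" or "v1.2.3"; anything else is rejected.
SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse(version: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patch), or None if the tag is not in the assumed format."""
    match = SEMVER.match(version.strip())
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def bump_type(old: str, new: str) -> Optional[str]:
    """Label the increment from `old` to `new` as 'major', 'minor', or 'patch'.

    Returns None when either tag cannot be parsed or when `new` is not
    strictly greater than `old` (so there is no increment to label).
    """
    a, b = parse(old), parse(new)
    if a is None or b is None or b <= a:
        return None
    if b[0] != a[0]:
        return "major"
    if b[1] != a[1]:
        return "minor"
    return "patch"
```

For instance, `bump_type("1.2.3", "1.3.0")` yields `"minor"`, while a tag such as `"1.2"` is treated as not following the format and yields `None`.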
