The do's and don'ts of infrastructure code: A systematic gray literature review

Abstract. Context: Infrastructure-as-code (IaC) is the DevOps tactic of managing and provisioning software infrastructures through machine-readable definition files, rather than manual hardware configuration or interactive configuration tools. Objective: From a maintenance and evolution perspective, the topic has piqued the interest of practitioners and academics alike, given the relative scarcity of supporting patterns and practices in the academic literature. At the same time, a considerable amount of gray literature exists on IaC. We therefore aim to characterize IaC and compile a catalog of best and bad practices for widely used IaC languages, all using gray literature materials. Method: In this paper, we systematically analyze the industrial gray literature on IaC, such as blog posts, tutorials, and white papers, using qualitative analysis techniques. Results: We propose a definition for IaC and distill a broad catalog of practices, summarized in a taxonomy consisting of 10 primary categories for best practices and 4 for bad practices. The catalog covers both language-agnostic and language-specific practices for three IaC languages, namely Ansible, Puppet, and Chef. The practices reflect implementation issues, design issues, and the violation of or adherence to the essential principles of IaC. Conclusion: Our findings reveal critical insights concerning the top languages as well as the best practices adopted by practitioners to address (some of) the associated challenges. We show that the field of IaC development and maintenance is in its infancy and deserves further attention.
