Configuration smells in continuous delivery pipelines: a linter and a six-month study on GitLab

An effective and efficient application of Continuous Integration (CI) and Delivery (CD) requires software projects to follow certain principles and good practices. Configuring a CI/CD pipeline is challenging and error-prone; therefore, automated linters have been proposed to detect errors in pipeline configurations. While existing linters identify syntactic errors and detect security vulnerabilities or misuse of the features provided by build servers, they do not support developers who want to prevent common misconfigurations of a CD pipeline that potentially violate CD principles (“CD smells”). To this end, we propose CD-Linter, a semantic linter that can automatically identify four different smells in pipeline configuration files. We have evaluated our approach through a large-scale, long-term study that consists of (i) monitoring 145 issues (each opened in a different open-source project) over a period of six months, (ii) manually validating the detection precision and recall on a representative sample of issues, and (iii) assessing the magnitude of the observed smells on 5,312 open-source projects on GitLab. Our results show that CD smells are accepted and fixed by most of the developers, and that our linter achieves a precision of 87% and a recall of 94%. These smells are also frequently observed in the wild: 31% of projects with long configuration files are affected by at least one smell.
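The abstract does not enumerate the four smells CD-Linter detects, so the sketch below should be read as an illustration of what a semantic (rather than purely syntactic) pipeline linter looks like, not as CD-Linter's actual rule set. It is a minimal Python example, assuming PyYAML is installed; the embedded .gitlab-ci.yml snippet and the checks it applies (silenced failures, manual gates, blind retries, unpinned image versions) are hypothetical examples of misconfigurations that could violate CD principles.

```python
# Minimal sketch of a semantic linter for GitLab CI configuration files.
# Assumes PyYAML; the config below and the rules checked are illustrative,
# not necessarily the four smells detected by CD-Linter.
import yaml

EXAMPLE_CONFIG = """
image: python:latest          # unpinned version: the build environment may silently change
deploy:
  stage: deploy
  when: manual                # human gate in what should be an automated pipeline
  retry: 2                    # automatic retries can mask flaky, non-deterministic steps
  allow_failure: true         # failures no longer break the pipeline
  script:
    - ./deploy.sh
"""

# Top-level keys that are not job definitions in a .gitlab-ci.yml file.
RESERVED_KEYS = {"image", "stages", "variables", "include", "default", "workflow"}

def lint(config: dict) -> list[str]:
    """Return human-readable warnings for smell-like patterns in a parsed config."""
    warnings = []
    if str(config.get("image", "")).endswith(":latest"):
        warnings.append("global image uses the 'latest' tag (unpinned version)")
    for name, job in config.items():
        if name in RESERVED_KEYS or not isinstance(job, dict):
            continue
        if job.get("allow_failure") is True:
            warnings.append(f"job '{name}': failures are silently allowed")
        if job.get("when") == "manual":
            warnings.append(f"job '{name}': requires manual execution")
        if job.get("retry"):
            warnings.append(f"job '{name}': relies on automatic retries")
    return warnings

if __name__ == "__main__":
    for warning in lint(yaml.safe_load(EXAMPLE_CONFIG)):
        print("possible smell:", warning)
```

Running the sketch prints one warning per suspicious key; reporting each finding separately mirrors the paper's approach of opening an issue per detected smell rather than failing the whole pipeline.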
