Styler: Learning Formatting Conventions to Repair Checkstyle Errors

Code formatting conventions play an important role in code readability. In this paper, we present Styler, an automatic repair tool dedicated to fixing formatting-related errors raised by Checkstyle, a highly configurable format checker for Java. To fix formatting errors in a given project, Styler learns fixes based on the Checkstyle ruleset defined in that project and uses machine learning to predict repairs for the current errors. In an empirical evaluation, we found that Styler repaired 24% of 497 real Checkstyle errors mined from five GitHub projects. Moreover, in a comparison with the state-of-the-art machine learning code formatters Naturalize and CodeBuff, we found that Styler fixes the most real Checkstyle errors and also generates the smallest repairs. Finally, we conclude that Styler is a promising tool for use in IDEs and in Continuous Integration environments to repair Checkstyle errors.
