ConfigFile++: Automatic comment enhancement for misconfiguration prevention

Misconfiguration has become one of the key causes of system failures. Most current research on the topic addresses misconfiguration diagnosis and pays less attention to teaching users how to configure software correctly, so that misconfigurations are prevented before they happen. In this paper, we manually study 22 open source software projects and summarize several observations about the comments in their configuration files, most of which lack sufficient information and are poorly formatted. Based on these observations and the general process of misconfiguration diagnosis, we design and implement a tool called ConfigFile++ that automatically enhances the comments in configuration files. Using name-based analysis and machine learning, ConfigFile++ extracts guiding information about each configuration option from the user manual and source code, and inserts it into the configuration files. The format of the inserted comments is designed to keep the enhanced comments concise and clear. We evaluate our tool on real-world examples of misconfigurations. The results show that ConfigFile++ can prevent 33 out of 50 misconfigurations.
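
As a rough illustration of the insertion step only, the following minimal sketch places guidance comments above matching options in a configuration file. It is not the ConfigFile++ implementation: the `GUIDANCE` table, its fields (`desc`, `type`, `range`), the comment layout, and the function name `enhance_comments` are all hypothetical, and the extraction from the manual and source code (name-based analysis, machine learning) is assumed to have already produced the table.

```python
import re

# Hypothetical guidance records. In the approach described above, this
# information would be extracted from the user manual and source code;
# here it is hard-coded for illustration.
GUIDANCE = {
    "max_connections": {
        "desc": "Maximum number of simultaneous client connections.",
        "type": "integer",
        "range": "1-100000",
    },
}

# Matches "name = value" or "name: value", optionally commented out with '#'.
OPTION_RE = re.compile(r"^\s*#?\s*([A-Za-z_][\w.-]*)\s*[=:]")


def enhance_comments(config_text, guidance, prefix="#"):
    """Return config_text with a short guidance comment above each known option."""
    out = []
    for line in config_text.splitlines():
        match = OPTION_RE.match(line)
        if match and match.group(1) in guidance:
            name = match.group(1)
            info = guidance[name]
            # Assumed two-line comment layout: description, then type and valid range.
            out.append(f"{prefix} [{name}] {info['desc']}")
            out.append(f"{prefix}   type: {info['type']}, valid: {info['range']}")
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "port = 3306\nmax_connections = 151\n"
    print(enhance_comments(sample, GUIDANCE))
```

Options without an entry in the table are left untouched, so the sketch only adds comments where guidance is available.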
