Practical and accurate pinpointing of configuration errors using static analysis

Software misconfigurations are responsible for a substantial share of today's system failures, causing about one-quarter of all customer-reported issues. Identifying their root causes can be costly in terms of time and human resources. We present an approach that automatically pinpoints such defects without reproducing the error. It uses static analysis to infer the degree of correlation between each configuration option and the program sites affected by an exception. The only run-time information our approach requires is the stack trace of a failure. This is a key advantage over existing approaches, which require reproducing errors or providing testing oracles. We evaluate our approach on 29 errors from four configurable software programs, namely JChord, Randoop, Hadoop, and HBase. Our approach successfully diagnoses 27 of the 29 errors. For 20 errors, the failure-inducing configuration option is ranked first.
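The ranking idea can be illustrated with a minimal sketch. It is not the analysis used in the paper: it assumes a static analysis has already produced, for each configuration option, the set of methods that the option may influence, and it uses a simple depth-weighted overlap with the stack trace as a stand-in for the correlation degree. The function name `rank_options`, the weighting scheme, and all option and method names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def rank_options(
    affected: Dict[str, Set[str]], stack_trace: List[str]
) -> List[Tuple[str, float]]:
    """Rank configuration options by how strongly they overlap a stack trace.

    affected:    option name -> methods the option may influence
                 (assumed to come from a prior static analysis).
    stack_trace: method names from the failure, top frame first.
    """
    scores: Dict[str, float] = {}
    for option, methods in affected.items():
        score = 0.0
        for depth, frame in enumerate(stack_trace):
            if frame in methods:
                # Assumption: frames closer to the exception weigh more.
                score += 1.0 / (depth + 1)
        scores[option] = score
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical input: three options and the methods each may influence.
affected = {
    "opt.heap": {"A.load", "A.run"},
    "opt.logdir": {"B.write"},
    "opt.timeout": {"A.run"},
}
trace = ["A.load", "A.run", "Main.main"]
ranking = rank_options(affected, trace)
```

In this toy input, `opt.heap` touches the two topmost frames and is therefore ranked first, while `opt.logdir` touches none and is ranked last. A user would inspect the options from the top of the list downward.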
