Automatic Rootcausing for Program Equivalence Failures in Binaries

Equivalence checking of imperative programs has several applications including compiler validation and cross-version verification. Debugging equivalence failures can be tedious for large examples, especially for low-level binary programs. In this paper, we formalize a simple yet precise notion of verifiable rootcause for equivalence failures that leverages semantic similarity between two programs. Unlike existing works on program repair, our definition of rootcause avoids the need for a template of fixes or the need for a complete repair to ensure equivalence. We show progressively weaker checks for detecting rootcauses that can be applicable even when multiple fixes are required to make the two programs equivalent. We provide optimizations based on Maximum Satisfiability (MAXSAT) and binary search to prune the search space of such rootcauses. We have implemented the techniques in SymDiff and provide an evaluation on a set of real-world compiler validation binary benchmarks.

[1]  Sanjit A. Seshia,et al.  Sketching stencils , 2007, PLDI '07.

[2]  Rupak Majumdar,et al.  Cause clue clauses: error localization using maximum satisfiability , 2010, PLDI '11.

[3]  Chris Hawblitzel,et al.  Safe to the last instruction: automated verification of a type-safe operating system , 2011, CACM.

[4]  Dawson R. Engler,et al.  Practical, Low-Effort Equivalence Verification of Real Code , 2011, CAV.

[5]  Emina Torlak,et al.  Angelic debugging , 2011, 2011 33rd International Conference on Software Engineering (ICSE).

[6]  Sorin Lerner,et al.  Proving optimizations correct using parameterized program equivalence , 2009, PLDI '09.

[7]  Sumit Gulwani,et al.  Automated feedback generation for introductory programming assignments , 2012, PLDI.

[8]  Westley Weimer,et al.  Patches as better bug reports , 2006, GPCE '06.

[9]  Thomas Wies,et al.  Error Invariants , 2012, FM.

[10]  Hanan Samet Compiler testing via symbolic interpretation , 1976, ACM '76.

[11]  Shuvendu K. Lahiri,et al.  Unifying type checking and property checking for low-level code , 2009, POPL '09.

[12]  Shuvendu K. Lahiri,et al.  SYMDIFF: A Language-Agnostic Semantic Diff Tool for Imperative Programs , 2012, CAV.

[13]  Jyotirmoy V. Deshmukh,et al.  Automatic Generation of Local Repairs for Boolean Programs , 2008, 2008 Formal Methods in Computer-Aided Design.

[14]  Ofer Strichman,et al.  Regression verification , 2009, 2009 46th ACM/IEEE Design Automation Conference.

[15]  H. Cleve,et al.  Locating causes of program failures , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[16]  Roderick Bloem,et al.  Automated error localization and correction for imperative programs , 2011, 2011 Formal Methods in Computer-Aided Design (FMCAD).

[17]  Dawei Qi,et al.  SemFix: Program repair via semantic analysis , 2013, 2013 35th International Conference on Software Engineering (ICSE).

[18]  George C. Necula,et al.  Translation validation for an optimizing compiler , 2000, PLDI '00.

[19]  K. Rustan M. Leino,et al.  Weakest-precondition of unstructured programs , 2005, PASTE '05.

[20]  Shuvendu K. Lahiri,et al.  Differential assertion checking , 2013, ESEC/FSE 2013.

[21]  Shuvendu K. Lahiri,et al.  Will you still compile me tomorrow? static cross-version compiler validation , 2013, ESEC/FSE 2013.

[22]  Vikram S. Adve,et al.  Using likely invariants for automated software fault localization , 2013, ASPLOS '13.