SHErrLoc: A Static Holistic Error Locator

We introduce a general way to locate programmer mistakes that are detected by static analyses. The program analysis is expressed in a general constraint language that is powerful enough to model type checking, information flow analysis, dataflow analysis, and points-to analysis. Mistakes in the analyzed program result in unsatisfiable constraints. Given an unsatisfiable system of constraints, both the satisfiable and the unsatisfiable constraints are analyzed to identify the program expressions most likely to be the cause of unsatisfiability. The likelihood of different error explanations is evaluated under the assumption that the programmer’s code is mostly correct, so the simplest explanations are chosen, following Bayesian principles. For analyses that rely on programmer-stated assumptions, the diagnosis also identifies assumptions likely to have been omitted. The new error diagnosis approach has been implemented as a tool called SHErrLoc, which is applied to three very different program analyses, including type inference for the highly expressive type system implemented by the Glasgow Haskell Compiler, which features type classes, Generalized Algebraic Data Types (GADTs), and type families. The effectiveness of the approach is evaluated using previously collected programs containing errors. The results show that, compared to existing compilers and other tools, SHErrLoc consistently identifies the location of programmer errors significantly more accurately, without any language-specific heuristics.
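The ranking idea can be illustrated with a toy sketch. It is not SHErrLoc's algorithm or interface: the function name `rank_explanations`, the representation of each constraint as a set of expression names, and the two weights are all assumptions made for this illustration. A candidate explanation is a small set of expressions that touches every unsatisfiable constraint; candidates with fewer expressions, and with fewer satisfiable constraints depending on them, receive a lower cost and are ranked first.

```python
from itertools import combinations


def rank_explanations(unsat, sat, max_size=2, size_weight=3.0, sat_weight=1.0):
    """Rank candidate explanations from lowest to highest cost.

    unsat, sat: lists of sets of expression names. Each set lists the
    expressions involved in one unsatisfiable (resp. satisfiable) constraint.
    The weights are hypothetical tuning knobs, not values from the paper.
    """
    expressions = sorted(set().union(*unsat))
    ranked = []
    for size in range(1, max_size + 1):
        for candidate in combinations(expressions, size):
            chosen = set(candidate)
            # A candidate explains the failure only if it touches
            # every unsatisfiable constraint.
            if not all(chosen & u for u in unsat):
                continue
            # Expressions that also occur in satisfiable constraints are
            # less suspicious, so each such occurrence adds to the cost.
            sat_hits = sum(1 for s in sat if chosen & s)
            cost = size_weight * size + sat_weight * sat_hits
            ranked.append((cost, candidate))
    ranked.sort()
    return ranked


# Hypothetical input: expression "x" occurs in both failing constraints
# and in none of the satisfiable ones.
unsat = [{"x", "f"}, {"x", "g"}]
sat = [{"f", "h"}, {"g", "h"}, {"f", "g"}]
for cost, candidate in rank_explanations(unsat, sat):
    print(cost, candidate)
```

On this input the single expression `x` gets the lowest cost, because it alone accounts for both failing constraints and no satisfiable constraint depends on it. Exhaustive enumeration is used only to keep the sketch short; it would not scale to realistic constraint systems.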
