A suite of abstract domains for static analysis of string values

Strings are widely used in modern programming languages in a variety of scenarios. For instance, strings are used to build Structured Query Language (SQL) queries that are then executed. Malformed strings may lead to subtle bugs, and non-sanitized strings may raise security issues in an application. For these reasons, applying static analysis to compute safety properties over string values at compile time is particularly appealing. In this article, we propose a generic approach for the static analysis of string values based on abstract interpretation. In particular, we design a suite of abstract semantics for strings, where each abstract domain tracks a different kind of information. We discuss the trade-off between efficiency and accuracy when using such domains to capture the properties of interest. In this way, the analysis can be tuned to different levels of precision and efficiency, and it can address specific properties. Copyright © 2013 John Wiley & Sons, Ltd.
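To make the idea of an abstract domain for strings concrete, the following is a minimal sketch of one possible domain. It assumes a simple character-inclusion abstraction that tracks which characters a string certainly contains and which it may contain. The class name `CharInclusion` and all of its methods are hypothetical names chosen for illustration; they are not taken from the article, and the sketch is not claimed to match any of the article's actual domains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharInclusion:
    """Hypothetical abstract value: over-approximates a set of strings."""
    certain: frozenset  # characters that every represented string contains
    maybe: frozenset    # characters that a represented string may contain

    @staticmethod
    def of(s: str) -> "CharInclusion":
        # Abstraction of a single string constant: both sets are exact.
        chars = frozenset(s)
        return CharInclusion(chars, chars)

    def join(self, other: "CharInclusion") -> "CharInclusion":
        # Merge of two control-flow branches: keep only characters certain
        # on both branches, and allow characters possible on either branch.
        return CharInclusion(self.certain & other.certain,
                             self.maybe | other.maybe)

    def concat(self, other: "CharInclusion") -> "CharInclusion":
        # Abstract concatenation: the result contains the characters of
        # both operands.
        return CharInclusion(self.certain | other.certain,
                             self.maybe | other.maybe)

    def may_contain(self, c: str) -> bool:
        return c in self.maybe

    def must_contain(self, c: str) -> bool:
        return c in self.certain


# Abstracts: name = "alice" if cond else "o'brien"
name = CharInclusion.of("alice").join(CharInclusion.of("o'brien"))

# Abstracts: query = "SELECT * FROM users WHERE name = " + name
query = CharInclusion.of("SELECT * FROM users WHERE name = ").concat(name)
```

In this sketch, `query.may_contain("'")` is true, so an analysis built on this domain could flag the query as possibly containing an unescaped quote, while `query.may_contain(";")` is false, so statement chaining can be ruled out. The domain is cheap (set operations only) but forgets character order, which illustrates the kind of efficiency versus accuracy trade-off the abstract refers to.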
