A Decision Procedure for Path Feasibility of String Manipulating Programs with Integer Data Type

Strings are widely used in programs, especially in web applications. The integer data type occurs naturally in string-manipulating programs, where it is frequently used to refer to lengths of, or positions in, strings. Analysis and testing of string-manipulating programs can be formulated as the path feasibility problem: given a symbolic execution path, does there exist an assignment to the inputs that yields a concrete execution realizing this path? This problem can naturally be reformulated as a string constraint solving problem. Although state-of-the-art string constraint solvers usually support both the string and integer data types, they mainly resort to heuristics without completeness guarantees. In this paper, we propose a decision procedure for a class of string-manipulating programs that includes not only a wide range of string operations, such as concatenation, replaceAll, reverse, and finite transducers, but also operations involving the integer data type, such as length, indexOf, and substring. To the best of our knowledge, this is one of the most expressive string constraint languages currently known to be decidable. Our decision procedure is based on a variant of cost register automata. We implement the decision procedure in a new solver, OSTRICH+, and evaluate its performance on a wide range of existing and new benchmarks. The experimental results show that OSTRICH+ is the first string decision procedure capable of tackling finite transducers and integer constraints, while its overall performance is comparable with that of state-of-the-art string constraint solvers.
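To make the path feasibility problem concrete, the following Python sketch encodes one hypothetical execution path that mixes string operations (replaceAll, reverse) with integer operations (indexOf, length, substring), and then searches for an input that realizes it. The path, the names `realizes_path` and `find_witness`, the alphabet, and the length bound are all assumptions made for illustration. The search is a naive bounded enumeration, not the cost-register-automata-based decision procedure of the paper: it can find a witness within the bound, but it cannot prove that a path is infeasible.

```python
from itertools import product
from typing import Optional


def realizes_path(x: str) -> bool:
    """Check whether input x drives execution along one hypothetical path.

    The path (an illustrative assumption, not taken from the paper) is:
        i   := indexOf(x, "=");        assume i >= 1
        key := substring(x, 0, i)
        val := substring(x, i + 1, length(x) - i - 1)
        assume length(val) == length(key) + 1
        w   := reverse(replaceAll(val, "a", "b"))
        assume w starts with "bc"
    """
    i = x.find("=")                      # indexOf: -1 if "=" does not occur
    if i < 1:
        return False
    key = x[:i]                          # substring before the first "="
    val = x[i + 1:]                      # substring after the first "="
    if len(val) != len(key) + 1:         # integer constraint on lengths
        return False
    w = val.replace("a", "b")[::-1]      # replaceAll followed by reverse
    return w.startswith("bc")


def find_witness(alphabet: str = "abc=", max_len: int = 5) -> Optional[str]:
    """Enumerate inputs by increasing length; return the first that realizes the path.

    Returning None only means no witness exists up to max_len; it is not a
    proof of infeasibility.
    """
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            x = "".join(chars)
            if realizes_path(x):
                return x
    return None


if __name__ == "__main__":
    print(find_witness())
```

Under these assumed defaults the search finds the witness `a=ca`. A complete decision procedure differs from this sketch precisely in the negative case: it must answer "infeasible" with certainty, which no bounded enumeration can do.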
