Undecidability of a Theory of Strings, Linear Arithmetic over Length, and String-Number Conversion

In recent years there has been considerable interest in theories over string equations, the length function, and a string-number conversion predicate within the formal verification, software engineering, and security communities. SMT solvers for these theories, such as Z3str2, CVC4, and S3, are of immense practical value in exposing security vulnerabilities in string-intensive programs. Additionally, there are many open decidability and complexity-theoretic questions about theories over strings that are of great interest to mathematicians. Motivated by these applications and open questions, we study a first-order, many-sorted, quantifier-free theory $T_{s,n}$ of string equations, linear arithmetic over string length, and a string-number conversion predicate, and prove three theorems. First, we prove that the satisfiability problem for $T_{s,n}$ is undecidable via a reduction from a theory of linear arithmetic over the natural numbers with a power predicate, which we call power arithmetic. Second, we show that the string-number conversion predicate is expressible in terms of the power predicate, string equations, and the length function. This second theorem, in conjunction with the reduction used for the undecidability theorem, suggests that the power predicate is expressible in terms of word equations and the length function if and only if the string-number conversion predicate is also expressible in the same fragment. Such results are useful tools for comparing the expressive power of different theories and for establishing decidability and complexity results. Third, we provide a consistent axiomatization $\Gamma$ for the functions and predicates of $T_{s,n}$. Additionally, we prove that the theory $T_{\Gamma}$, obtained via logical closure of $\Gamma$, is not a complete theory.
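To make the three ingredients of $T_{s,n}$ concrete, the following minimal sketch checks one candidate assignment against a small quantifier-free formula that combines a string equation, a length constraint, and a string-number conversion atom. The names `numstr` and `satisfies`, the sample formula, and the reading of the conversion predicate as "the string is a binary numeral for the number, with leading zeros allowed" are assumptions made for illustration only; they are not taken from the paper's definitions.

```python
def numstr(n: int, s: str) -> bool:
    """Assumed conversion predicate: s is a non-empty string over {0, 1}
    whose value as a binary numeral (leading zeros allowed) is n."""
    return n >= 0 and len(s) > 0 and set(s) <= {"0", "1"} and int(s, 2) == n


def satisfies(x: str, y: str, n: int) -> bool:
    """Check the hypothetical formula
        x . "0" = y   AND   len(y) = len(x) + 1   AND   numstr(n, y)
    under the given assignment to the string variables x, y and integer n."""
    return x + "0" == y and len(y) == len(x) + 1 and numstr(n, y)


print(satisfies("101", "1010", 10))  # True: "1010" is binary for 10
print(satisfies("101", "1010", 5))   # False: the conversion atom fails
```

Checking a single assignment like this is straightforward; the undecidability result concerns deciding whether any satisfying assignment exists for an arbitrary formula of the theory.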
