Modeling Regular Replacement for String Constraint Solving

Bugs in user input sanitation of software systems often lead to vulnerabilities, and many of them are caused by improper use of regular replacement. This paper presents a precise modeling of various semantics of regular substitution, such as the declarative, finite, greedy, and reluctant semantics, using finite state transducers (FSTs). By projecting an FST to its input/output tapes, we are able to solve atomic string constraints, which can be applied to both the forward and backward image computation in model checking and symbolic execution of text processing programs. We report several interesting discoveries, e.g., certain fragments of the general problem can be handled using less expressive deterministic FSTs. A compact representation of FSTs is implemented in SUSHI, a string constraint solver, which is applied to detecting vulnerabilities in web applications.
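To illustrate why the choice of replacement semantics matters, the following minimal sketch contrasts greedy and reluctant matching using Python's standard `re` module. It is not the paper's FST construction or SUSHI's API; the function names `replace_greedy` and `replace_reluctant` and the pattern `a+` are assumptions chosen only for illustration.

```python
import re

# Hypothetical helpers (not from the paper): the same pattern and the same
# replacement string, differing only in the matching semantics.

def replace_greedy(s: str) -> str:
    # Greedy: each match of "a+" consumes the longest possible run of 'a'.
    return re.sub(r"a+", "b", s)

def replace_reluctant(s: str) -> str:
    # Reluctant: each match of "a+?" consumes the shortest possible run,
    # i.e. a single 'a', so every 'a' is replaced separately.
    return re.sub(r"a+?", "b", s)

print(replace_greedy("aaa"))     # b
print(replace_reluctant("aaa"))  # bbb
```

The two calls yield different outputs for the same input, which suggests why a solver that models replacement has to distinguish these semantics rather than treat them as one operation.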
