Static approximation of dynamically generated Web pages

Server-side programming is one of the key technologies that support today's WWW environment. It makes it possible to generate Web pages dynamically according to a user's request and to customize pages for each user. However, the flexibility obtained by server-side programming makes it much harder to guarantee the validity and security of dynamically generated pages.

To statically check properties of Web pages generated dynamically by a server-side program, we develop a static program analysis that approximates the string output of a program with a context-free grammar. The approximation obtained by the analyzer can be used to check various properties of a server-side program and the pages it generates.

To demonstrate the effectiveness of the analysis, we have implemented a string analyzer for the server-side scripting language PHP. The analyzer has been successfully applied to publicly available PHP programs to detect cross-site scripting vulnerabilities and to validate the pages they generate dynamically.
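To make the idea of approximating string output with a context-free grammar concrete, the following is a minimal sketch in Python. It is not the analyzer described above: it assumes a toy program made only of string assignments, ignores control flow entirely (every assignment to a variable becomes one production for that variable), and all names (`grammar_from_assignments`, `language_up_to`, `run_program`) are hypothetical. The toy program corresponds to the PHP-like fragment `$x = "a"; while (...) { $x = "(" . $x . ")"; } echo $x;`.

```python
from itertools import product

def grammar_from_assignments(assignments):
    """Turn assignments into productions: each variable is a nonterminal.

    An assignment is (variable, parts); a part is ("lit", text) or
    ("var", name). This flow-insensitive reading is an assumption of
    the sketch, not the method of the analyzer in the abstract.
    """
    grammar = {}
    for var, parts in assignments:
        grammar.setdefault(var, []).append(tuple(parts))
    return grammar

def language_up_to(grammar, start, max_len):
    """Enumerate every string of length <= max_len derivable from start."""
    lang = {nt: set() for nt in grammar}
    changed = True
    while changed:  # fixpoint iteration; terminates because lengths are bounded
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                # Snapshot the current candidates for each part of the rule.
                choices = [[text] if kind == "lit" else list(lang.get(text, ()))
                           for kind, text in rhs]
                for combo in product(*choices):
                    s = "".join(combo)
                    if len(s) <= max_len and s not in lang[nt]:
                        lang[nt].add(s)
                        changed = True
    return lang[start]

def run_program(iterations):
    """Concrete execution of the toy program for a given loop count."""
    x = "a"
    for _ in range(iterations):
        x = "(" + x + ")"
    return x

# x -> "a" | "(" x ")"
assignments = [
    ("x", [("lit", "a")]),
    ("x", [("lit", "("), ("var", "x"), ("lit", ")")]),
]
grammar = grammar_from_assignments(assignments)
approx = language_up_to(grammar, "x", 5)
print(sorted(approx, key=len))  # ['a', '(a)', '((a))']
```

The grammar here is an over-approximation in the intended sense: every concrete output of `run_program` that fits within the length bound appears in the enumerated language. The nested-parenthesis output also shows why a context-free grammar is a natural target, since this language is not regular.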
