Efficient static analysis of XML paths and types

We present an algorithm for solving XPath decision problems under regular tree type constraints and show how it can be used to statically type-check XPath queries. To this end, we prove the decidability of a logic with converse for finite ordered trees; satisfiability can be decided in time singly exponential in the size of the formula. The logic corresponds to the alternation-free modal μ-calculus without greatest fixpoint, restricted to finite trees, and where formulas are cycle-free. Our proof method relies on two auxiliary results. First, XML regular tree types and XPath expressions admit a linear translation into cycle-free formulas. Second, the least and greatest fixpoints coincide on finite trees, hence the logic is closed under negation. Building on these results, we describe a practical system for deciding the satisfiability of a formula. We have tested the system on decision problems such as XPath emptiness, containment, overlap, and coverage, with or without type constraints. As a result, the system can be used in static analyzers for programming languages that manipulate both XPath expressions and XML type annotations (as input and output types).
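
Because the logic is closed under negation, each of the decision problems listed above can be phrased as a single satisfiability test. The block below is a minimal sketch of the usual reductions, not the paper's exact encoding. The notation is an assumption introduced here for illustration: $\varphi_e$ stands for the formula obtained by translating an XPath expression $e$. Type constraints would enter by building each $\varphi_e$ relative to the formula of a regular tree type; how that type formula is anchored in the tree is glossed over here.

```latex
% Hypothetical notation (not from the abstract):
% \varphi_e is the cycle-free formula obtained from XPath expression e.
\begin{align*}
\text{emptiness of } e
  &: \quad \varphi_e \text{ is unsatisfiable} \\
\text{containment } e_1 \subseteq e_2
  &: \quad \varphi_{e_1} \wedge \neg\varphi_{e_2} \text{ is unsatisfiable} \\
\text{overlap of } e_1 \text{ and } e_2
  &: \quad \varphi_{e_1} \wedge \varphi_{e_2} \text{ is satisfiable} \\
\text{coverage of } e \text{ by } e_1, \dots, e_n
  &: \quad \varphi_e \wedge \bigwedge_{i=1}^{n} \neg\varphi_{e_i} \text{ is unsatisfiable}
\end{align*}
```

In each case a satisfying tree, when one exists, serves as a concrete witness: for containment, for instance, it is a document with a node selected by $e_1$ but not by $e_2$.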
