Type inference for queries on semistructured data

We study type checking and type inference for queries over semistructured data. Introducing a novel traces technique, we show that the problem is NP-complete in general but can be solved in PTIME for many practical cases, including, in particular, queries over XML data. Beyond their intrinsic interest, type inference and the related traces technique have several important applications: they facilitate query formulation, optimization, and verification.
