Optimizing XML twig queries with full-text predicates

Efficient query processing is a critical issue for XML repositories. In this paper, we consider XML queries that can be represented as a query tree with twig patterns and that also contain full-text constraints. Two approaches have previously been proposed to process such queries: the structure-first approach and the keyword-first approach. The main focus of this paper is to construct an integrated system that supports both approaches and finds the best execution plan. To achieve this goal, we first analyze the components of the two approaches and design a set of operators. We then derive the corresponding cost model and rewriting rules to perform cost-based optimization. We also propose several heuristic rules based on the observed behavior of the two approaches. Through an extensive experimental study, we demonstrate that both our cost-based system and our heuristic system are effective.