Raindrop: a uniform and layered algebraic framework for XQueries on XML streams

XML stream applications bring the challenge of efficiently processing queries on sequentially accessible token-based data. While the automata model is naturally suited for pattern matching on tokenized XML streams, the algebraic model in contrast is a well-established technique for set-oriented processing of self-contained tuples. However, neither automata nor algebraic models are well-equipped to handle both computation paradigms. The goal of the Raindrop project is to accommodate these two paradigms within one algebraic framework to take advantage of both. In our query model, both tokenized data and self-contained tuples are supported in a uniform manner. Query plans can be flexibly rewritten using equivalence rules to change what computation is done using tokenized data versus tuples. This paper highlights the four abstraction levels in Raindrop, namely, semantics-focused plan, stream logical plan, stream physical plan and execution plan. Various optimization techniques are provided at each level. The necessity of such a uniform and layered plan is shown by experimental study.