Paper Information - Enhanced Statistics for Element-Centered XML Summaries

Enhanced Statistics for Element-Centered XML Summaries

Element-centered XML summaries collect statistical information about document nodes and their axis relationships and aggregate it separately for each distinct element or attribute name. They have already partially proven their superiority in quality, space consumption, and evaluation performance. This kind of inversion seems to offer more service capability than conventional approaches. Therefore, we refine and extend element-centered XML summaries to capture more statistical information, and we propose new estimation methods. We tested our ideas on a set of documents with widely varying characteristics.
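To make the idea of aggregating statistics per element name concrete, the following is a minimal sketch, not the data structure used in the paper. It assumes a summary that stores, for each distinct element name, an occurrence count plus counters for the child and parent axes. The function `build_summary`, the dictionary layout, and the sample document `DOC` are hypothetical and chosen only for illustration; attributes and the remaining axes are left out.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def build_summary(xml_text):
    """Aggregate occurrence and child/parent axis counts per element name.

    Hypothetical layout: summary[name] = {"count", "child", "parent"}.
    """
    root = ET.fromstring(xml_text)
    summary = defaultdict(
        lambda: {"count": 0, "child": Counter(), "parent": Counter()}
    )
    stack = [root]
    while stack:
        node = stack.pop()
        summary[node.tag]["count"] += 1
        for kid in node:
            # One entry per axis relationship, keyed by the related name.
            summary[node.tag]["child"][kid.tag] += 1
            summary[kid.tag]["parent"][node.tag] += 1
            stack.append(kid)
    return summary


# Illustrative document (an assumption, not taken from the paper).
DOC = (
    "<lib>"
    "<book><title/><author/><author/></book>"
    "<book><title/></book>"
    "<journal><title/></journal>"
    "</lib>"
)

summary = build_summary(DOC)
# Number of title nodes whose parent is a book node.
print(summary["title"]["parent"]["book"])
```

Because every statistic is stored under an element name, a one-step question such as "how many `title` nodes have a `book` parent" becomes a single lookup in this sketch. Longer path expressions would need an estimation method on top of these counters, which the sketch does not attempt.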

Caetano Sauer | José de Aguiar Moraes Filho | Theo Härder

[1] Hongjun Lu, et al. Bloom Histogram: Path Selectivity Estimation for XML Data with Updates, 2004, VLDB.

[2] Neoklis Polyzotis, et al. XSKETCH synopses for XML data graphs, 2006, TODS.

[3] José de Aguiar Moraes Filho, et al. EXsum: an XML summarization framework, 2008, IDEAS '08.

[4] M. Tamer Özsu, et al. XSEED: Accurate and Fast Cardinality Estimation for XPath Queries, 2006, 22nd International Conference on Data Engineering (ICDE'06).

[5] Juliana Freire, et al. StatiX: making XML count, 2002, SIGMOD '02.

[6] Jeffrey Scott Vitter, et al. XPathLearner: An On-line Self-Tuning Markov Histogram for XML Path Selectivity Estimation, 2002, VLDB.

[7] Jeffrey F. Naughton, et al. Estimating the Selectivity of XML Path Expressions for Internet Scale Applications, 2001, VLDB.

[8] Theo Härder, et al. An efficient infrastructure for native transactional XML processing, 2007, Data Knowl. Eng.