Holistically Twig Matching in Probabilistic XML

Traditional databases manage only deterministic information, but many applications that rely on databases now involve uncertain data. For example, a sensor database cannot feasibly hold the exact value of every sensor at every point in time. Uncertainty is inherent in such systems because of measurement errors, sampling errors, and resource limitations. This paper addresses the processing of twig pattern queries over probabilistic XML documents. Existing algorithms evaluate twig patterns by traversal, and their main shortcoming is that the whole probabilistic XML document must be scanned to obtain the final results. In this paper, we first represent a probabilistic XML document as probabilistic tag streams and then match these streams holistically. Extensive experiments show that the proposed holistic approach outperforms the traversal approach.
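To make the idea of probabilistic tag streams concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not define the probabilistic model or the stream layout, so the sketch assumes that each node carries an independent probability of existing given its parent, and that a stream entry holds a region encoding (start, end, level) plus the probability of the root-to-node path. The names `Node`, `Entry`, `build_streams`, and `is_ancestor` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import NamedTuple


@dataclass
class Node:
    tag: str
    prob: float = 1.0  # ASSUMPTION: P(node exists | parent exists)
    children: list = field(default_factory=list)


class Entry(NamedTuple):
    start: int        # position of the opening tag
    end: int          # position of the closing tag
    level: int        # depth in the tree, root = 1
    path_prob: float  # product of probabilities from the root to this node


def build_streams(root):
    """Return {tag: [Entry, ...]} with each list in document order."""
    streams = defaultdict(list)
    counter = 0

    def visit(node, level, parent_prob):
        nonlocal counter
        counter += 1
        start = counter
        path_prob = parent_prob * node.prob
        # Reserve the slot now so same-tag descendants land after it.
        slot = len(streams[node.tag])
        streams[node.tag].append(None)
        for child in node.children:
            visit(child, level + 1, path_prob)
        counter += 1
        streams[node.tag][slot] = Entry(start, counter, level, path_prob)

    visit(root, 1, 1.0)
    return dict(streams)


def is_ancestor(a, d):
    """Region-encoding test: a's interval strictly contains d's interval."""
    return a.start < d.start and d.end < a.end


# A small hypothetical document: <a><b p=0.5><c p=0.8/></b><b p=0.9/></a>
doc = Node("a", 1.0, [Node("b", 0.5, [Node("c", 0.8)]), Node("b", 0.9)])
streams = build_streams(doc)

# Candidate matches for the path pattern b//c, found from the two streams
# alone, without walking the document tree again.
b_c_matches = [(b, c) for b in streams["b"] for c in streams["c"]
               if is_ancestor(b, c)]
```

The nested loop is only there to keep the sketch short. A holistic twig join would instead merge the sorted streams with a chain of stacks, which is the step that avoids rescanning the whole document.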
