A model for imperfect XML data based on Dempster-Shafer's theory of evidence

1. Department of Computer Science, University of Bologna, Mura Anteo Zamboni 7, 40127 Bologna, Italy
2. Department of Mathematics and Informatics, University of Camerino, Via Madonna delle Carceri 9, 62032 Camerino MC, Italy

In this paper we present a model for uncertain tree-structured data, based on Dempster–Shafer’s theory of evidence. This theory is applied to the algebraic manipulation of tree-structured data for the first time, providing a sound basis for the application of interval probabilities. A major contribution of our work is the definition of general algebraic operators that can be applied in the presence of interval probabilities. As a second contribution, we extend the concept of local probability, which has already been used for probabilistic XML data, to probability intervals. For instance, an XML document representing a patient record can contain many uncertain diagnoses, and we may want to focus on one of them and compute its local probabilities. We show how to evaluate uncertainty at single nodes, even when their local probabilities are only partially specified or unknown.
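To illustrate how the theory of evidence yields interval probabilities, the following minimal sketch computes the standard belief and plausibility of a set of hypotheses from a mass function. It uses only the textbook Dempster-Shafer definitions and is not the algebra defined in the paper. The diagnosis names, the mass values, and the function names (`belief`, `plausibility`, `interval`) are assumptions made up for this example.

```python
from typing import Dict, FrozenSet, Tuple

# A mass function assigns evidence to subsets of the frame of discernment.
Mass = Dict[FrozenSet[str], float]


def belief(m: Mass, a: FrozenSet[str]) -> float:
    """Lower bound: total mass of the focal sets contained in `a`."""
    return sum(v for b, v in m.items() if b <= a)


def plausibility(m: Mass, a: FrozenSet[str]) -> float:
    """Upper bound: total mass of the focal sets that intersect `a`."""
    return sum(v for b, v in m.items() if b & a)


def interval(m: Mass, a: FrozenSet[str]) -> Tuple[float, float]:
    """Probability interval [Bel(a), Pl(a)] for the hypothesis set `a`."""
    return belief(m, a), plausibility(m, a)


# Hypothetical uncertain diagnoses for one patient record (illustrative values).
frame = frozenset({"flu", "cold", "allergy"})
m: Mass = {
    frozenset({"flu"}): 0.5,           # evidence pointing to flu only
    frozenset({"flu", "cold"}): 0.25,  # evidence that cannot tell flu from cold
    frame: 0.25,                       # mass left on the whole frame (ignorance)
}

flu = interval(m, frozenset({"flu"}))
cold = interval(m, frozenset({"cold"}))
print(flu, cold)
```

The mass assigned to the whole frame represents ignorance: it widens the interval of every hypothesis without supporting any single one, which is how a partially specified or unknown probability can still be given bounds.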
