Combining DAML+OIL, XSLT, and Probabilistic Logics for Uncertain Schema Mappings in MIND

When distributed, heterogeneous digital libraries have to be integrated, one of the crucial tasks is mapping between their schemas. Because schemas may differ in granularity and schema attributes do not always match precisely, a general-purpose schema mapping approach requires support for uncertain mappings. In this paper we present one of the very few approaches for defining and using uncertain schema mappings. We combine several technologies: DAML+OIL, probabilistic Datalog (since DAML+OIL, like similar ontology languages, lacks rules), and XSLT for actually transforming queries and documents. This declarative approach is fully implemented in the MIND project, which develops methods for retrieval in networked multimedia digital libraries. However, as DAML+OIL lacks some important features, the proposed approach is only a stepping stone toward an integrated solution.
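To make the idea of an uncertain mapping concrete, the following minimal sketch shows how weighted attribute-mapping rules might rewrite a query from one schema into another. It is an illustration only, not the MIND implementation: it uses Python instead of probabilistic Datalog and XSLT, and the attribute names (`creator`, `author`, `editor`, `title`), the probabilities, and the helper names `MappingRule`, `Condition`, and `map_query` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRule:
    """Hypothetical uncertain mapping: source_attr maps to target_attr with probability prob."""
    source_attr: str
    target_attr: str
    prob: float


@dataclass(frozen=True)
class Condition:
    """One weighted query condition of the form attr = value."""
    attr: str
    value: str
    weight: float = 1.0


def map_query(query, rules):
    """Rewrite each condition with every matching rule, scaling its weight by the rule probability."""
    mapped = []
    for cond in query:
        for rule in rules:
            if rule.source_attr == cond.attr:
                mapped.append(
                    Condition(rule.target_attr, cond.value, cond.weight * rule.prob)
                )
    return mapped


# Assumed example rules: "creator" in the source schema corresponds to
# "author" or "editor" in the target schema, with invented probabilities.
RULES = [
    MappingRule("creator", "author", 0.7),
    MappingRule("creator", "editor", 0.3),
    MappingRule("title", "title", 1.0),
]

query = [Condition("creator", "Fuhr"), Condition("title", "retrieval")]
mapped = map_query(query, RULES)

for cond in mapped:
    print(cond.attr, cond.value, cond.weight)
```

A coarse-grained source attribute fans out into several target attributes, each carrying the probability of its rule; a retrieval system could then use these weights when ranking results.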
