A graph-based approach for extracting terminological properties of elements of XML documents

XML is rapidly becoming a standard for information exchange over the Web. Web providers and applications that use XML to represent and exchange their data make their information available in a way that eases interoperability. However, to guarantee both the exchange of XML documents and interoperability between information providers, it is often necessary to identify semantic similarity properties relating concepts of different XML documents. This paper contributes to this area by proposing a technique for extracting synonymies and homonymies. The derivation technique is based on a rich conceptual model, called SDR-Network, which represents the concepts expressed in XML documents as well as the semantic relationships holding among them.
