The SEWASIE Network of Mediator Agents for Semantic Search

Integrating heterogeneous information on the Internet is becoming a key activity for enabling more organized and semantically meaningful access to data sources. Since the Internet can be viewed as a data-sharing network whose sites are data sources, the challenge is twofold. First, each source presents information according to its own view of the domain, i.e. it assumes a specific ontology. Second, data sources are usually isolated, i.e. they share no topological information about the content or structure of other sources. The classical approach to these issues is provided by mediator systems, which create a unified virtual view of the underlying data sources in order to hide the heterogeneity of the data and give users transparent access to the integrated information. In this paper we propose a multi-agent architecture for building and managing a network of mediators. While a single peer (i.e. a mediator agent) carries out data integration activities independently, it exchanges knowledge with other peers by means of specialized agents (i.e. brokers), which provide a coherent plan for accessing information in the peer network. This defines two layers in the system: at the local level, peers maintain an integrated view of local sources, while at the network level, agents maintain mappings among the different peers. The result is a new networked mediator system intended to operate in web economies, which we realized in the SEWASIE (SEmantic Webs and AgentS in Integrated Economies) project. SEWASIE is an RTD project supported by the 5th Framework IST program of the European Community, which ended successfully in September 2005.
