WISE-Integrator: An Automatic Integrator of Web Search Interfaces for E-Commerce

More and more databases are becoming Web-accessible through form-based search interfaces, and many of these sources are E-commerce sites. Providing unified access to multiple E-commerce search engines that sell similar products is important because it lets users search and compare products from multiple sites with ease. One key task in providing such a capability is to integrate the Web interfaces of these E-commerce search engines so that user queries can be submitted against the integrated interface. Currently, such search interfaces are integrated either manually or semi-automatically, which is inefficient and difficult to maintain. In this paper, we present WISE-Integrator, a tool that performs automatic integration of Web Interfaces of Search Engines. WISE-Integrator employs sophisticated techniques to identify matching attributes from different search interfaces for integration. It also resolves domain differences of matching attributes. Our experimental results, based on 20 and 50 interfaces in two different domains, indicate that WISE-Integrator can achieve high attribute matching accuracy and can produce high-quality integrated search interfaces without human interaction.
