ADMIRE: an adaptive data model for meta search engines

Abstract: Because search engines differ widely, integrating them efficiently is important but difficult. A meta-searcher needs a data model that describes the query capabilities of heterogeneous search engines in detail. With such a model, the meta-searcher can map users' queries to specific sources more accurately and thereby achieve good precision and recall. The model also helps with selecting target sources and computing priorities. Because new search engines emerge frequently and existing ones are updated as their functions and content change, the data model must be adaptive and scalable to keep pace with the rapidly developing World Wide Web. This paper gives a formal description of the query capabilities of heterogeneous search engines and an algorithm for mapping a query from a general mediator format into the wrapper format of a specific search engine. Compared with related work, our work focuses more on the constraints on and between terms, on attribute order, and on the impact of restrictions on logical operators. Our contribution is a data model that is both expressive enough to describe the query capabilities of current World Wide Web search engines meticulously and flexible enough to integrate them efficiently.
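To make the idea of capability-driven query mapping concrete, the following is a minimal sketch and not the ADMIRE model or algorithm itself. Every name in it (`SourceCapability`, `Query`, `map_query`, and the `fields`, `operators`, and `max_terms` attributes) is a hypothetical illustration. It assumes a source is described only by its searchable fields in the order it expects them, the logical operators it accepts, and a limit on the number of terms.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SourceCapability:
    """Hypothetical description of what one search engine accepts."""
    name: str
    fields: list[str]     # searchable fields, in the order the source expects
    operators: set[str]   # logical operators allowed between terms
    max_terms: int        # maximum number of terms per query


@dataclass
class Query:
    """Mediator-level query: (field, value) terms joined by one operator."""
    terms: list[tuple[str, str]]
    operator: str = "AND"


def map_query(query: Query, cap: SourceCapability) -> tuple[Query, list[tuple[str, str]]]:
    """Map a mediator query onto one source (illustrative only).

    Returns the source-specific query and the terms that had to be
    dropped, so a caller could filter results on them afterwards.
    """
    if query.operator not in cap.operators:
        raise ValueError(f"{cap.name} does not support {query.operator}")

    # Split terms by whether the source can search the field at all.
    supported = [t for t in query.terms if t[0] in cap.fields]
    dropped = [t for t in query.terms if t[0] not in cap.fields]

    # Respect the attribute order the source expects (stable sort).
    supported.sort(key=lambda t: cap.fields.index(t[0]))

    # Respect the term-count constraint; overflow terms are dropped too.
    dropped += supported[cap.max_terms:]
    supported = supported[:cap.max_terms]

    return Query(supported, query.operator), dropped
```

For example, mapping a query on author, year, and title onto a source that only searches title and author (in that order) would reorder the two supported terms and report the year term as dropped. A real model would need richer constraints than this sketch assumes.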
