A Meta-search Method with Clustering and Term Correlation

A meta-search engine forwards user queries to its participant search engines according to a server selection strategy. To support server selection, the meta-search engine must keep concise descriptors of the document collections indexed by the participant search engines. Most existing approaches record in the descriptors which terms appear in a document collection, but omit information about which documents each term appears in. This leads to ineffective server ranking for multi-term queries, because a document collection may contain every query term without any single document containing all of them.
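The limitation described above can be illustrated with a minimal sketch. It is not the method proposed in the paper. The function names (`term_descriptor`, `descriptor_matches`, `cooccurrence_count`) and the two toy collections are assumptions made up for this example. Each document is modelled as a set of terms, and the descriptor stores only per-term document frequencies, which is one common form of term-level descriptor.

```python
from typing import Dict, List, Set

Document = Set[str]


def term_descriptor(docs: List[Document]) -> Dict[str, int]:
    """Build a collection-level descriptor: term -> number of documents containing it.

    Hypothetical helper. It records which terms occur in the collection,
    but not which documents they occur in.
    """
    df: Dict[str, int] = {}
    for doc in docs:
        for term in doc:
            df[term] = df.get(term, 0) + 1
    return df


def descriptor_matches(descriptor: Dict[str, int], query: List[str]) -> bool:
    """Return True if every query term appears somewhere in the collection."""
    return all(term in descriptor for term in query)


def cooccurrence_count(docs: List[Document], query: List[str]) -> int:
    """Count documents that contain all query terms (needs document-level data)."""
    wanted = set(query)
    return sum(1 for doc in docs if wanted <= doc)


# Toy collections (illustrative data only).
collection_a = [{"meta", "data"}, {"search", "tree"}]
collection_b = [{"meta", "search", "engine"}, {"data", "tree"}]
query = ["meta", "search"]

for name, docs in (("A", collection_a), ("B", collection_b)):
    descriptor = term_descriptor(docs)
    print(
        name,
        "descriptor matches:", descriptor_matches(descriptor, query),
        "| documents with all terms:", cooccurrence_count(docs, query),
    )
```

Both collections look equally promising to a selector that sees only the term-level descriptor, yet only collection B holds a document containing both query terms. Capturing that difference requires some information about how terms co-occur within documents.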
