Paper information - Clustering hypertext with applications to web searching - 字舞流文

Clustering hypertext with applications to web searching

A method and structure for searching a database containing hypertext documents. The database is searched using a query to produce a set of hypertext documents, and that set is then geometrically clustered into various clusters using a toric k-means similarity measure, such that the documents within each cluster are similar to each other. The clustering has linear-time complexity in producing the set of hypertext documents. The similarity measure comprises a weighted sum of maximized individual components of the set of hypertext documents, and the clustering is based upon three features: the words contained in each hypertext document, the out-links from each hypertext document, and the in-links to each hypertext document.
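The abstract does not give the formula for the similarity measure, so the following is only a minimal sketch under stated assumptions: each document is represented as three vectors (words, out-links, in-links), each component is compared by cosine similarity, and the three scores are combined with fixed weights. The names `toric_similarity` and `assign`, the weights, and the toy vectors are all hypothetical and are not taken from the source.

```python
import math

def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors stay unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def toric_similarity(doc, centroid, weights):
    """Weighted sum of per-component cosine similarities.

    doc and centroid are triples (words, out_links, in_links);
    weights is a triple of non-negative numbers (assumed to sum to 1).
    """
    return sum(
        w * dot(normalize(d), normalize(c))
        for w, d, c in zip(weights, doc, centroid)
    )

def assign(docs, centroids, weights):
    """Assign each document to its most similar centroid (one k-means step)."""
    return [
        max(range(len(centroids)),
            key=lambda j: toric_similarity(doc, centroids[j], weights))
        for doc in docs
    ]

# Toy data (hypothetical): (word vector, out-link vector, in-link vector)
docs = [
    ([1.0, 0.0, 0.0], [1.0, 0.0], [0.0, 1.0]),
    ([0.9, 0.1, 0.0], [1.0, 0.0], [0.0, 1.0]),
    ([0.0, 0.0, 1.0], [0.0, 1.0], [1.0, 0.0]),
]
centroids = [docs[0], docs[2]]   # assumed initial centroids
weights = (0.5, 0.25, 0.25)      # assumed component weights

labels = assign(docs, centroids, weights)
print(labels)
```

In this sketch each assignment pass evaluates every document once against each centroid, so for a fixed number of clusters the cost grows linearly with the number of documents, which is in line with the linear-time claim in the abstract.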

W. Scott Spangler | Dharmendra S. Modha

[1] Hans-Peter Frei, et al. Making use of hypertext links when retrieving information, 1992, ECHT '92.

[2] Paul S. Bradley, et al. Refining Initial Points for K-Means Clustering, 1998, ICML.

[3] Ricardo Baeza-Yates, et al. Information Retrieval: Data Structures and Algorithms, 1992.

[4] Man Hon Wong, et al. Web Document Classification based on Hyperlinks and Document Semantics, 2000, PRICAI Workshop on Text and Web Mining.

[5] Ray R. Larson, et al. Bibliometrics of the World Wide Web: An Exploratory Analysis of the Intellectual Structure of Cyberspace, 1996.

[6] Henry G. Small, et al. Co-citation in the scientific literature: A new measure of the relationship between two documents, 1973, J. Am. Soc. Inf. Sci.

[7] Gerard Salton, et al. Associative Document Retrieval Techniques Using Bibliographic Information, 1963, JACM.

[8] Jon M. Kleinberg, et al. Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text, 1998, Comput. Networks.

[9] W. Bruce Croft, et al. A retrieval model incorporating hypertext links, 1989, Hypertext.

[10] Chaomei Chen. Structuring and visualising the WWW by generalised similarity analysis, 1997, HYPERTEXT '97.

[11] Jon Kleinberg, et al. Authoritative sources in a hyperlinked environment, 1999, SODA '98.

[12] Piotr Indyk, et al. Enhanced hypertext categorization using hyperlinks, 1998, SIGMOD '98.

[13] Kui-Lam Kwok, et al. A probabilistic theory of indexing and similarity measure based on cited and citing documents, 1985, J. Am. Soc. Inf. Sci.

[14] Sougata Mukherjea, et al. Organizing topic-specific web information, 2000, HYPERTEXT '00.

[15] Michael McGill, et al. Introduction to Modern Information Retrieval, 1983.

[16] L. R. Rasmussen, et al. In information retrieval: data structures and algorithms, 1992.

[17] Sougata Mukherjea, et al. Interactive clustering for navigating in hypermedia systems, 1994, ECHT '94.

[18] Chanathip Namprempre, et al. HyPursuit: a hierarchical network search engine that exploits content-link hypertext clustering, 1996, HYPERTEXT '96.

[19] Peter Willett, et al. Recent trends in hierarchic document clustering: A critical review, 1988, Inf. Process. Manag.

[20] Mary Czerwinski, et al. From latent semantics to spatial hypertext—an integrated approach, 1998, HYPERTEXT '98.

[21] Vincent Kanade, et al. Clustering Algorithms, 2021, Wireless RF Energy Transfer in the Massive IoT Era.

[22] Alan F. Smeaton, et al. A Connectivity Analysis Approach to Increasing Precision in Retrieval From Hyperlinked Documents, 1999, TREC.

[23] Craig Silverstein, et al. Analysis of a Very Large Altavista Query Log, SRC Technical note #1998-14, 1998.

[24] Loren G. Terveen, et al. Constructing, organizing, and visualizing collections of topically related Web resources, 1999, TCHI.

[25] Giles, et al. Searching the world wide Web, 1998, Science.

[26] Ramana Rao, et al. Silk from a sow's ear: extracting usable structures from the Web, 1996, CHI.

[27] Rodrigo A. Botafogo. Cluster analysis for hypertext systems, 1993, SIGIR.