Probe, cluster, and discover: focused extraction of QA-Pagelets from the deep Web
[1] Luis Gravano,et al. QProber: A system for automatic classification of hidden-Web databases , 2003, TOIS.
[2] Luis Gravano,et al. Distributed Search over the Hidden Web: Hierarchical Database Sampling and Selection , 2002, VLDB.
[3] Tobias Dönz. Extracting Structured Data from Web Pages , 2003 .
[4] Jon M. Kleinberg,et al. Inferring Web communities from link topology , 1998, HYPERTEXT '98.
[5] Forouzan Golshani,et al. Proceedings of the Eighth International Conference on Data Engineering , 1992 .
[6] G. Karypis,et al. Criterion Functions for Document Clustering: Experiments and Analysis , 2001 .
[7] King-Lup Liu,et al. Detection of heterogeneities in a multiple text database environment , 1999, Proceedings Fourth IFCIS International Conference on Cooperative Information Systems (CoopIS 99).
[8] James P. Callan,et al. Automatic discovery of language models for text databases , 1999, SIGMOD '99.
[9] Gerard Salton,et al. A vector space model for automatic indexing , 1975, CACM.
[10] Ziv Bar-Yossef,et al. Template detection via data mining and its applications , 2002, WWW.
[11] C. E. Shannon,et al. A mathematical theory of communication , 1948, MOCO.
[12] Vladimir I. Levenshtein,et al. Binary codes capable of correcting deletions, insertions, and reversals , 1965 .
[13] Anil K. Jain,et al. Data clustering: a review , 1999, CSUR.
[14] Michael K. Bergman. White Paper: The Deep Web: Surfacing Hidden Value , 2001 .
[15] Shigeo Abe. Pattern Classification , 2001, Springer London.
[16] G. Karypis,et al. Criterion functions for document clustering , 2005 .
[17] Inderjit S. Dhillon,et al. Efficient Clustering of Very Large Document Collections , 2001 .
[18] Geoffrey Zweig,et al. Syntactic Clustering of the Web , 1997, Comput. Networks.
[19] B. Huberman,et al. The Deep Web: Surfacing Hidden Value , 2000 .
[20] David G. Stork,et al. Pattern Classification , 1973 .
[21] H. V. Jagadish,et al. Evaluating Structural Similarity in XML Documents , 2002, WebDB.
[22] George Karypis,et al. A Comparison of Document Clustering Techniques , 2000 .
[23] William W. Cohen. Recognizing Structure in Web Pages using Similarity Queries , 1999, AAAI/IAAI.
[24] Monika Henzinger,et al. Finding Related Pages in the World Wide Web , 1999, Comput. Networks.
[25] Martin F. Porter,et al. An algorithm for suffix stripping , 1997, Program.
[26] Dave Raggett. Clean Up Your Web Pages with HTML TIDY , 1999 .
[27] David Hawking,et al. Methods for information server selection , 1999, TOIS.
[28] Ravi Kumar,et al. Trawling the Web for Emerging Cyber-Communities , 1999, Comput. Networks.
[29] Doug Beeferman,et al. Agglomerative clustering of a search engine query log , 2000, KDD '00.
[30] Valter Crescenzi,et al. RoadRunner: Towards Automatic Data Extraction from Large Web Sites , 2001, VLDB.
[31] Krishna Bharat,et al. Improved algorithms for topic distillation in a hyperlinked environment , 1998, SIGIR '98.
[32] Gerard Salton,et al. Term-Weighting Approaches in Automatic Text Retrieval , 1988, Inf. Process. Manag.
[33] Oren Etzioni,et al. Web document clustering: a feasibility demonstration , 1998, SIGIR '98.