Estimating corpus size via queries
Andrei Z. Broder | Rajeev Motwani | Marcus Fontoura | Andrew Tomkins | Vanja Josifovski | Rina Panigrahy | Ravi Kumar | Ying Xu | Shubha U. Nabar
[1] Andrei Z. Broder, et al. A Technique for Measuring the Relative Size and Overlap of Public Web Search Engines, 1998, Comput. Networks.
[2] C. Lee Giles, et al. Searching the World Wide Web, 1998, Science.
[3] Andrei Z. Broder, et al. Mirror, Mirror on the Web: A Study of Host Pairs with Replicated Content, 1999, Comput. Networks.
[4] C. Lee Giles, et al. Accessibility of information on the web, 1999, Nature.
[5] Andrei Z. Broder, et al. A Comparison of Techniques to Find Mirrored Hosts on the WWW, 2000, IEEE Data Eng. Bull.
[6] Steve Chien, et al. Approximating Aggregate Queries about Web Pages via Random Walks, 2000, VLDB.
[7] Marc Najork, et al. On near-uniform URL sampling, 2000, Comput. Networks.
[8] David M. Pennock, et al. Methods for Sampling Pages Uniformly from the World Wide Web, 2001.
[9] King-Lup Liu, et al. Discovering the representative of a search engine, 2001, CIKM '01.
[10] James P. Callan, et al. Query-based sampling of text databases, 2001, TOIS.
[11] Shengli Wu, et al. Experiments with Document Archive Size Detection, 2003, ECIR.
[12] Antonio Gulli, et al. The indexable web is more than 11.5 billion pages, 2005, WWW '05.
[13] Andrei Z. Broder, et al. Sampling Search-Engine Results, 2005, WWW '05.
[14] Ziv Bar-Yossef, et al. Random sampling from a search engine's index, 2006, WWW '06.