On the evolution of clusters of near-duplicate Web pages
[1] Monika Henzinger,et al. Finding Related Pages in the World Wide Web , 1999, Comput. Networks.
[2] Proceedings. First Latin American Web Congress , 2003 .
[3] Jeffrey D. Ullman,et al. Set Merging Algorithms , 1973, SIAM J. Comput..
[4] Marc Najork,et al. A large‐scale study of the evolution of Web pages , 2004, Softw. Pract. Exp..
[5] Alan M. Frieze,et al. Min-Wise Independent Permutations , 2000, J. Comput. Syst. Sci..
[6] Geoffrey Zweig,et al. Syntactic Clustering of the Web , 1997, Comput. Networks.
[7] Marc Najork,et al. High-performance Web Crawling , 2001 .
[8] Andrei Z. Broder,et al. A Comparison of Techniques to Find Mirrored Hosts on the WWW , 2000, IEEE Data Eng. Bull..
[9] Andrei Z. Broder,et al. Mirror, Mirror on the Web: A Study of Host Pairs with Replicated Content , 1999, Comput. Networks.
[10] Udi Manber,et al. Finding Similar Files in a Large File System , 1994, USENIX Winter.
[11] Burton H. Bloom,et al. Space/time trade-offs in hash coding with allowable errors , 1970, CACM.
[12] Hector Garcia-Molina,et al. Finding replicated Web collections , 2000, SIGMOD '00.
[13] A. Broder. Some applications of Rabin’s fingerprinting method , 1993 .