Finding Near-Replicas of Documents and Servers on the Web
[1] Luis Gravano,et al. Merging Ranks from Heterogeneous Internet Sources , 1997, VLDB.
[2] Udi Manber,et al. GLIMPSE: A Tool to Search Through Entire File Systems , 1994, USENIX Winter.
[3] Hector Garcia-Molina,et al. SCAM: A Copy Detection Mechanism for Digital Documents , 1995, DL.
[4] Rajeev Motwani,et al. Computing Iceberg Queries Efficiently , 1998, VLDB.
[5] Hector Garcia-Molina,et al. Building a scalable and accurate copy detection mechanism , 1996, DL '96.
[6] Geoffrey Zweig,et al. Syntactic Clustering of the Web , 1997, Comput. Networks.
[7] Hector Garcia-Molina,et al. Copy detection mechanisms for digital documents , 1995, SIGMOD '95.
[8] Alon Y. Halevy,et al. Using Probabilistic Information in Data Integration , 1997, VLDB.
[9] Jon Kleinberg,et al. Authoritative sources in a hyperlinked environment , 1999, SODA '98.
[10] Udi Manber,et al. Finding Similar Files in a Large File System , 1994, USENIX Winter.
[11] Andrei Z. Broder,et al. On the resemblance and containment of documents , 1997, Proceedings. Compression and Complexity of SEQUENCES 1997.
[12] Hector Garcia-Molina,et al. Efficient Crawling Through URL Ordering , 1998, Comput. Networks.
[13] H. Garcia-Molina,et al. Computing Iceberg Queries Efficiently , 1998 .