Applying syntactic similarity algorithms for enterprise information management
[1] Alan M. Frieze, et al. Min-Wise Independent Permutations, 2000, J. Comput. Syst. Sci.
[2] Justin Zobel, et al. Methods for Identifying Versioned and Plagiarized Documents, 2003, J. Assoc. Inf. Sci. Technol.
[3] Moses Charikar, et al. Similarity estimation techniques from rounding algorithms, 2002, STOC '02.
[4] Hector Garcia-Molina, et al. Finding near-replicas of documents on the Web, 1999.
[5] Marc Najork, et al. Detecting phrase-level duplication on the world wide web, 2005, SIGIR '05.
[6] Bart Preneel, et al. Hash functions, 2005, Encyclopedia of Cryptography and Security.
[7] Paul Mackerras, et al. The rsync algorithm, 1996.
[8] Monika Henzinger, et al. Finding near-duplicate web pages: a large-scale evaluation of algorithms, 2006, SIGIR.
[9] Val Henson, et al. An Analysis of Compare-by-hash, 2003, HotOS.
[10] George Forman, et al. Finding similar files in large document repositories, 2005, KDD '05.
[11] Hector Garcia-Molina, et al. Copy detection mechanisms for digital documents, 1995, SIGMOD '95.
[12] Ophir Frieder, et al. Collection statistics for fast duplicate document detection, 2002, TOIS.
[13] Andrei Z. Broder, et al. On the resemblance and containment of documents, 1997, Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171).
[14] Geoffrey Zweig, et al. Syntactic Clustering of the Web, 1997, Comput. Networks.
[15] Hector Garcia-Molina, et al. Building a scalable and accurate copy detection mechanism, 1996, DL '96.
[16] Alan M. Frieze, et al. Min-wise independent permutations (extended abstract), 1998, STOC '98.
[17] Hector Garcia-Molina, et al. Finding Near-Replicas of Documents and Servers on the Web, 1998, WebDB.
[18] Marc Najork, et al. On the evolution of clusters of near-duplicate Web pages, 2003, Proceedings of the First Latin American Web Congress (LA-WEB 2003).
[19] Udi Manber, et al. Finding Similar Files in a Large File System, 1994, USENIX Winter.