Online duplicate document detection: signature reliability in a dynamic retrieval environment
[1] C. A. Moser, et al. Facts from Figures, 1953.
[2] T. Asfour, et al. Facts & Figures, 1962, Contemporary Canadian Picture Books.
[3] William H. Press, et al. Numerical recipes in C. The art of scientific computing, 1987.
[4] F. A. Seiler, et al. Numerical Recipes in C: The Art of Scientific Computing, 1989.
[5] W. Bruce Croft, et al. Inference networks for document retrieval, 1989, SIGIR '90.
[6] Carmen Miller. Detecting duplicates: a searcher's dream come true, 1990.
[7] Howard R. Turtle. Natural language vs. Boolean query evaluation: a comparison of retrieval performance, 1994, SIGIR '94.
[8] Paul Thompson, et al. TREC-3 Ad Hoc Retrieval and Routing Experiments using the WIN System, 1994, TREC.
[9] Udi Manber, et al. Finding Similar Files in a Large File System, 1994, USENIX Winter.
[10] Hector Garcia-Molina, et al. Copy detection mechanisms for digital documents, 1995, SIGMOD '95.
[11] Carol Tenopir, et al. TARGET and FREESTYLE: DIALOG and Mead join the relevance ranks, 1997.
[12] Donna K. Harman, et al. Overview of the Sixth Text REtrieval Conference (TREC-6), 1997, Inf. Process. Manag.
[13] Geoffrey Zweig, et al. Syntactic Clustering of the Web, 1997, Comput. Networks.
[14] Sergey Brin, et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine, 1998, Comput. Networks.
[15] Hector Garcia-Molina, et al. Finding Near-Replicas of Documents and Servers on the Web, 1998, WebDB.
[16] Hector Garcia-Molina, et al. Finding near-replicas of documents on the Web, 1999.
[17] Robert Wilensky, et al. Robust Hyperlinks: Cheap, Everywhere, Now, 2000, DDEP/PODDP.
[18] Ophir Frieder, et al. Efficiency Considerations for Scalable Information Retrieval Servers, 2006, J. Digit. Inf.
[19] James P. Callan, et al. Query-based sampling of text databases, 2001, TOIS.
[20] James W. Cooper, et al. Detecting similar documents using salient terms, 2002, CIKM '02.
[21] David M. Pennock, et al. Analysis of lexical signatures for finding lost or related documents, 2002, SIGIR '02.
[22] William H. Press, et al. Numerical recipes in C, 2002.
[23] Ophir Frieder, et al. Collection statistics for fast duplicate document detection, 2002, TOIS.