Near-duplicate detection for eRulemaking
[1] Jean Carletta, et al. Assessing Agreement on Classification Tasks: The Kappa Statistic, 1996, CL.
[2] Hector Garcia-Molina, et al. Copy detection mechanisms for digital documents, 1995, SIGMOD '95.
[3] Hector Garcia-Molina, et al. Duplicate Removal in Information Dissemination, 1998.
[4] Raymond J. Mooney, et al. Adaptive duplicate detection using learnable string similarity measures, 2003, KDD '03.
[5] Geoffrey Zweig, et al. Syntactic Clustering of the Web, 1997, Comput. Networks.
[6] Jacob Cohen. A Coefficient of Agreement for Nominal Scales, 1960.
[7] Hector Garcia-Molina, et al. Finding near-replicas of documents on the Web, 1999.
[8] Justin Zobel, et al. Methods for Identifying Versioned and Plagiarized Documents, 2003, J. Assoc. Inf. Sci. Technol.
[9] Michalis Vazirgiannis, et al. On Clustering Validation Techniques, 2001, Journal of Intelligent Information Systems.
[10] Stuart W. Shulman. An experiment in digital government at the United States National Organic Program, 2003.
[11] Hector Garcia-Molina, et al. Finding Near-Replicas of Documents and Servers on the Web, 1998, WebDB.
[12] Ophir Frieder, et al. Collection statistics for fast duplicate document detection, 2002, TOIS.
[13] Jack G. Conrad, et al. Constructing a text corpus for inexact duplicate detection, 2004, SIGIR '04.
[14] Jack G. Conrad, et al. Online duplicate document detection: signature reliability in a dynamic retrieval environment, 2003, CIKM '03.