Frequent Itemset Mining for Clustering Near Duplicate Web Documents
[1] Hector Garcia-Molina, et al. Finding replicated Web collections, 2000, SIGMOD '00.
[2] Sergei O. Kuznetsov, et al. Comparing performance of algorithms for generating concept lattices, 2002, J. Exp. Theor. Artif. Intell.
[3] Bart Goethals, et al. Advances in frequent itemset mining implementations: report on FIMI'03, 2004, SKDD.
[4] Christian Borgelt, et al. Efficient Implementations of Apriori and Eclat, 2003.
[5] Joshua Alspector, et al. Improved robustness of signature-based near-replica detection via lexicon randomization, 2004, KDD.
[6] Dan Klein, et al. Evaluating strategies for similarity search on the web, 2002, WWW '02.
[7] Jeffrey Xu Yu, et al. Efficient similarity joins for near-duplicate detection, 2011, TODS.
[8] Ahmad M. Hasnah, et al. A New Filtering Algorithm for Duplicate Document Based on Concept Analysis, 2006.
[9] Derrick G. Kourie, et al. AddIntent: A New Incremental Algorithm for Constructing Concept Lattices, 2004, ICFCA.
[10] Bart Goethals, et al. Advances in Frequent Itemset Mining Implementations: Introduction to FIMI03, 2003, FIMI.
[11] Bernhard Ganter, et al. Formal Concept Analysis: Mathematical Foundations, 1998.
[12] Benno Stein, et al. New Issues in Near-duplicate Detection, 2007, GfKl.
[13] Bart Goethals, et al. Proceedings of the ICDM 2003 Workshop on Frequent Itemset Mining Implementations (FIMI'03), 2003.
[14] George Karypis, et al. Empirical and Theoretical Comparisons of Selected Criterion Functions for Document Clustering, 2004, Machine Learning.
[15] Hector Garcia-Molina, et al. Finding near-replicas of documents on the Web, 1999.
[16] Andrei Z. Broder, et al. Identifying and Filtering Near-Duplicate Documents, 2000, CPM.
[17] Ophir Frieder, et al. Collection statistics for fast duplicate document detection, 2002, TOIS.
[18] Hongjun Lu, et al. AFOPT: An Efficient Implementation of Pattern Growth Approach, 2003, FIMI.
[19] Jean-François Boulicaut, et al. Constraint-Based Mining of Formal Concepts in Transactional Data, 2004, PAKDD.
[20] Eckart Zitzler, et al. BicAT: a biclustering analysis toolbox, 2006, Bioinform.
[21] Gösta Grahne, et al. Efficiently Using Prefix-trees in Mining Frequent Itemsets, 2003, FIMI.
[22] Alan M. Frieze, et al. Min-wise independent permutations (extended abstract), 1998, STOC '98.
[23] Hector Garcia-Molina, et al. Finding Near-Replicas of Documents and Servers on the Web, 1998, WebDB.
[24] Andrei Z. Broder, et al. On the resemblance and containment of documents, 1997, Proceedings. Compression and Complexity of SEQUENCES 1997.
[25] Ilya Valentinovich Segalovich, et al. An efficient method to detect duplicates of web documents with the use of inverted index, 2002, WWW 2002.
[26] Nicolas Pasquier, et al. Efficient Mining of Association Rules Using Closed Itemset Lattices, 1999, Inf. Syst.
[27] Johannes Gehrke, et al. MAFIA: A Performance Study of Mining Maximal Frequent Itemsets, 2003, FIMI.
[28] Alan M. Frieze, et al. Min-Wise Independent Permutations, 2000, J. Comput. Syst. Sci.