Efficient top-k count queries over imprecise duplicates
Sunita Sarawagi | Vinay S. Deshpande | Sourabh Kasliwal
[1] Rajeev Motwani, et al. Robust and efficient fuzzy match for online data cleaning, 2003, SIGMOD '03.
[2] Surajit Chaudhuri, et al. Example-driven design of efficient record matching queries, 2007, VLDB.
[3] Kevin Chen-Chuan Chang, et al. Probabilistic top-k and ranking-aggregate queries, 2008, TODS.
[4] Sugato Basu, et al. Adaptive product normalization: using online learning for record linkage in comparison shopping, 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).
[5] Rajeev Motwani, et al. Robust identification of fuzzy duplicates, 2005, 21st International Conference on Data Engineering (ICDE'05).
[6] Daphne Koller. Structured Probabilistic Models: Bayesian Networks and Beyond, 1998, AAAI/IAAI.
[7] Raymond J. Mooney, et al. Adaptive Blocking: Learning to Scale Up Record Linkage, 2006, Sixth International Conference on Data Mining (ICDM'06).
[8] Sunita Sarawagi, et al. Efficient set joins on similarity predicates, 2004, SIGMOD '04.
[9] David Harel, et al. A Multi-scale Algorithm for the Linear Arrangement Problem, 2002, WG.
[10] Ihab F. Ilyas, et al. A survey of top-k query processing techniques in relational database systems, 2008, CSUR.
[11] Raghav Kaushik, et al. Efficient exact set-similarity joins, 2006, VLDB.
[12] Surajit Chaudhuri, et al. Eliminating Fuzzy Duplicates in Data Warehouses, 2002, VLDB.
[13] Ivan P. Fellegi, et al. A Theory for Record Linkage, 1969.
[14] Venkatesan Guruswami, et al. Clustering with qualitative information, 2005, 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings.
[15] William W. Cohen, et al. Learning to Match and Cluster Entity Names, 2001.
[16] Santosh S. Vempala, et al. A divide-and-merge methodology for clustering, 2005, PODS '05.
[17] Stuart J. Russell, et al. Identity Uncertainty and Citation Matching, 2002, NIPS.
[18] Anil K. Jain, et al. Algorithms for Clustering Data, 1988.
[19] Luis Gravano, et al. Approximate String Joins in a Database (Almost) for Free, 2001, VLDB.
[20] Renée J. Miller, et al. Clean Answers over Dirty Databases: A Probabilistic Approach, 2006, 22nd International Conference on Data Engineering (ICDE'06).
[21] Pradeep Ravikumar, et al. Adaptive Name Matching in Information Integration, 2003, IEEE Intell. Syst.
[22] William W. Cohen. Data integration using similarity joins and a word-based information representation language, 2000, TOIS.
[23] Andrew McCallum, et al. A unified approach for schema matching, coreference and canonicalization, 2008, KDD.
[24] Sunita Sarawagi, et al. Scaling up the ALIAS Duplicate Elimination System, 2003, ICDE 2003.
[25] Avrim Blum, et al. Correlation Clustering, 2004, Machine Learning.
[26] Andrew McCallum, et al. Efficient clustering of high-dimensional data sets with application to reference matching, 2000, KDD '00.
[27] Alon Y. Halevy, et al. Data integration with uncertainty, 2007, The VLDB Journal.
[28] Bin Wang, et al. Cost-based variable-length-gram selection for string collections to support approximate queries efficiently, 2008, SIGMOD Conference.
[29] Andrew McCallum, et al. Toward Conditional Models of Identity Uncertainty with Application to Proper Noun Coreference, 2003, IIWeb.
[30] Mikhail Bilenko, et al. Learnable Similarity Functions and their Applications to Clustering and Record Linkage, 2004, AAAI.
[31] Anuradha Bhamidipaty, et al. Interactive deduplication using active learning, 2002, KDD.
[32] Lise Getoor, et al. Collective entity resolution in relational data, 2007, TKDD.
[33] Kevin Chen-Chuan Chang, et al. Supporting ad-hoc ranking aggregates, 2006, SIGMOD Conference.
[34] Teruko Mitamura, et al. Language-independent Probabilistic Answer Ranking for Question Answering, 2007, ACL.
[35] Alon Y. Halevy, et al. Bootstrapping pay-as-you-go data integration systems, 2008, SIGMOD Conference.
[36] Charles Elkan, et al. The Field Matching Problem: Algorithms and Applications, 1996, KDD.