Data Curation at Scale: The Data Tamer System
Michael Stonebraker | Stanley B. Zdonik | Daniel Bruckner | Ihab F. Ilyas | George Beskales | Mitch Cherniack | Alexander Pagan | Shan Xu
[1] Daisy Zhe Wang, et al. WebTables: exploring the power of tables on the web, 2008, Proc. VLDB Endow.
[2] Andrew McCallum, et al. Efficient clustering of high-dimensional data sets with application to reference matching, 2000, KDD '00.
[3] Peter Christen, et al. Febrl - Freely extensible biomedical record linkage, 2002.
[4] Erhard Rahm, et al. A survey of approaches to automatic schema matching, 2001, The VLDB Journal.
[5] W. Winkler. Overview of Record Linkage and Current Research Directions, 2006.
[6] Claire Mathieu, et al. Online Correlation Clustering, 2010, STACS.
[7] Sergei Vassilvitskii, et al. k-means++: the advantages of careful seeding, 2007, SODA '07.
[8] Phokion G. Kolaitis, et al. Semi-Automatic Schema Integration in Clio, 2007, VLDB.
[9] Ahmed K. Elmagarmid, et al. Duplicate Record Detection: A Survey, 2007, IEEE Transactions on Knowledge and Data Engineering.
[10] Joseph M. Hellerstein, et al. Potter's Wheel: An Interactive Data Cleaning System, 2001, VLDB.
[11] Peter Christen, et al. A Comparison of Fast Blocking Methods for Record Linkage, 2003, KDD 2003.
[12] Rajeev Motwani, et al. Robust identification of fuzzy duplicates, 2005, 21st International Conference on Data Engineering (ICDE'05).