Sampling dirty data for matching attributes
Shazia Wasim Sadiq | Henning Köhler | Xiaofang Zhou | Kerry L. Taylor | Yanfeng Shu
[1] Peter J. Haas, et al. The New Jersey Data Reduction Report, 1997.
[2] Anthony K. H. Tung, et al. Validating Multi-column Schema Matchings by Type, 2008, 2008 IEEE 24th International Conference on Data Engineering.
[3] Andrei Z. Broder, et al. On the resemblance and containment of documents, 1997, Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171).
[4] Geoffrey Zweig, et al. Syntactic Clustering of the Web, 1997, Comput. Networks.
[5] Yossi Matias, et al. Bifocal sampling for skew-resistant join size estimation, 1996, SIGMOD '96.
[6] Joann J. Ordille, et al. Data integration: the teenage years, 2006, VLDB.
[7] William W. Cohen. Integration of heterogeneous databases without common domains using queries based on textual similarity, 1998, SIGMOD '98.
[8] C. E. Shannon, et al. A mathematical theory of communication, 1948, MOCO.
[9] David Maier, et al. From databases to dataspaces: a new abstraction for information management, 2005, SGMD.
[10] Udi Manber, et al. Finding Similar Files in a Large File System, 1994, USENIX Winter.
[11] Felix Naumann, et al. Efficiently Detecting Inclusion Dependencies, 2007, 2007 IEEE 23rd International Conference on Data Engineering.
[12] Sang Joon Kim, et al. A Mathematical Theory of Communication, 2006.
[13] Doron Rotem, et al. Random sampling from databases: a survey, 1995.
[14] Rajeev Motwani, et al. On random sampling over joins, 1999, SIGMOD '99.
[15] Ahmed K. Elmagarmid, et al. Duplicate Record Detection: A Survey, 2007, IEEE Transactions on Knowledge and Data Engineering.
[16] Andrei Z. Broder, et al. Identifying and Filtering Near-Duplicate Documents, 2000, CPM.
[17] Viswanath Poosala, et al. Congressional samples for approximate answering of group-by queries, 2000, SIGMOD '00.
[18] AnHai Doan, et al. iMAP: Discovering Complex Mappings between Database Schemas, 2004, SIGMOD 2004.
[19] Doron Rotem, et al. Simple Random Sampling from Relational Databases, 1986, VLDB.
[20] Bin Wang, et al. VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams, 2007, VLDB.
[21] Theodore Johnson, et al. Mining database structure; or, how to build a data quality browser, 2002, SIGMOD '02.
[22] Calisto Zuzarte, et al. Query sampling in DB2 Universal Database, 2004, SIGMOD '04.
[23] Surajit Chaudhuri, et al. Effective use of block-level sampling in statistics estimation, 2004, SIGMOD '04.
[24] David Maier, et al. Principles of dataspace systems, 2006, PODS '06.
[25] Erhard Rahm, et al. A survey of approaches to automatic schema matching, 2001, The VLDB Journal.
[26] Peter J. Haas, et al. A bi-level Bernoulli scheme for database sampling, 2004, SIGMOD '04.
[27] Luis Gravano, et al. Approximate String Joins in a Database (Almost) for Free, 2001, VLDB.
[28] Laura M. Haas, et al. Schema Mapping as Query Discovery, 2000, VLDB.
[29] Pedro M. Domingos, et al. iMAP: discovering complex semantic matches between database schemas, 2004, SIGMOD '04.
[30] Charles F. Hockett, et al. A mathematical theory of communication, 1948, MOCO.