Duplicate detection in XML data
[1] Felix Naumann,et al. Automatic Data Fusion with HumMer , 2005, VLDB.
[2] Tok Wang Ling,et al. A knowledge-based approach for duplicate elimination in data cleaning , 2001, Inf. Syst..
[3] Bernard Rous,et al. The ACM digital library , 2001, CACM.
[4] Felix Naumann,et al. Relationship-Based Duplicate Detection , 2006 .
[5] Lise Getoor,et al. A Latent Dirichlet Model for Unsupervised Entity Resolution , 2005, SDM.
[6] Gonzalo Navarro,et al. A guided tour to approximate string matching , 2001, CSUR.
[7] Kyuseok Shim,et al. Query Optimization in the Presence of Foreign Functions , 1993, VLDB.
[8] Felix Naumann,et al. A Duplicate Detection Benchmark for XML (and Relational) Data , 2006 .
[9] Hector Garcia-Molina,et al. Duplicate Removal in Information Dissemination , 1998 .
[10] Pedro M. Domingos,et al. Entity Resolution with Markov Logic , 2006, Sixth International Conference on Data Mining (ICDM'06).
[11] Ahmed K. Elmagarmid,et al. Duplicate Record Detection: A Survey , 2007, IEEE Transactions on Knowledge and Data Engineering.
[12] Lluís A. Belanche Muñoz,et al. Feature selection algorithms: a survey and experimental evaluation , 2002, 2002 IEEE International Conference on Data Mining.
[13] Surajit Chaudhuri,et al. Eliminating Fuzzy Duplicates in Data Warehouses , 2002, VLDB.
[14] Mikhail Bilenko, Raymond J. Mooney. On Evaluation and Training-Set Construction for Duplicate Detection , 2003 .
[15] Felix Naumann,et al. XML Duplicate Detection Using Sorted Neighborhoods , 2006, EDBT.
[16] Pradeep Ravikumar,et al. A Comparison of String Distance Metrics for Name-Matching Tasks , 2003, IIWeb.
[17] Tiziana Catarci,et al. Structure-aware XML Object Identification , 2006, IEEE Data Eng. Bull..
[18] Byung-Won On,et al. Effective and scalable solutions for mixed and split citation problems in digital libraries , 2005, IQIS '05.
[19] Altigran Soares da Silva,et al. Finding similar identities among objects from multiple web sources , 2003, WIDM '03.
[20] Felix Naumann,et al. XStruct: Efficient Schema Extraction from Multiple and Large XML Documents , 2006, 22nd International Conference on Data Engineering Workshops (ICDEW'06).
[21] Ronald L. Rivest,et al. Introduction to Algorithms, Second Edition , 2001 .
[22] Wei-Ying Ma,et al. Object-level Vertical Search , 2007, CIDR.
[23] Dallan Quass,et al. Record Linkage for Genealogical Databases , 2003 .
[24] Hamid Pirahesh,et al. Extending XQuery for analytics , 2005, SIGMOD '05.
[25] Raymond J. Mooney,et al. Adaptive duplicate detection using learnable string similarity measures , 2003, KDD '03.
[26] Peter Fankhauser,et al. Unsupervised Duplicate Detection Using Sample Non-duplicates , 2006, J. Data Semant..
[27] H. V. Jagadish,et al. Evaluating Structural Similarity in XML Documents , 2002, WebDB.
[28] W. Winkler. Overview of Record Linkage and Current Research Directions , 2006 .
[29] Peter Fankhauser,et al. A Precise Blocking Method for Record Linkage , 2005, DaWaK.
[30] John Mylopoulos,et al. Representing and querying data transformations , 2005, 21st International Conference on Data Engineering (ICDE'05).
[31] Dennis Shasha,et al. Declaratively Cleaning your Data with AJAX , 2000, BDA.
[32] Ivan P. Fellegi,et al. A Theory for Record Linkage , 2004 .
[33] Felix Naumann,et al. DogmatiX tracks down duplicates in XML , 2005, SIGMOD '05.
[34] Salvatore J. Stolfo,et al. The merge/purge problem for large databases , 1995, SIGMOD '95.
[35] Andrew McCallum,et al. Object Consolidation by Graph Partitioning with a Conditionally-Trained Distance Metric , 2003 .
[36] Ioana Manolescu,et al. Declarative XML Data Cleaning with XClean , 2007, CAiSE.
[37] Peter Christen,et al. A Comparison of Fast Blocking Methods for Record Linkage , 2003, KDD 2003.
[38] Hector Garcia-Molina,et al. Generic Entity Resolution with Data Confidences , 2006, CleanDB.
[39] Rajeev Motwani,et al. Robust identification of fuzzy duplicates , 2005, 21st International Conference on Data Engineering (ICDE'05).
[40] Jiawei Han,et al. Profile-Based Object Matching for Information Integration , 2003, IEEE Intell. Syst..
[41] Renée J. Miller,et al. ConQuer: efficient management of inconsistent databases , 2005, SIGMOD '05.
[42] Pedro M. Domingos. Multi-Relational Record Linkage , 2003 .
[43] Salvatore J. Stolfo,et al. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem , 1998, Data Mining and Knowledge Discovery.
[44] Anuradha Bhamidipaty,et al. Interactive deduplication using active learning , 2002, KDD.
[45] Philip S. Yu,et al. LinkClus: efficient clustering via heterogeneous semantic links , 2006, VLDB.
[46] Charles Elkan,et al. An Efficient Domain-Independent Algorithm for Detecting Approximately Duplicate Database Records , 1997, DMKD.
[47] Dmitri V. Kalashnikov,et al. Exploiting relationships for object consolidation , 2005, IQIS '05.
[48] William W. Cohen,et al. Contextual search and name disambiguation in email using graphs , 2006, SIGIR.
[49] Felix Naumann,et al. Data Fusion in Three Steps: Resolving Schema, Tuple, and Value Inconsistencies , 2006, IEEE Data Eng. Bull..
[50] Felix Naumann,et al. Informationsintegration - Architekturen und Methoden zur Integration verteilter und heterogener Datenquellen , 2006 .
[51] Erhard Rahm,et al. A survey of approaches to automatic schema matching , 2001, The VLDB Journal.
[52] Jayant R. Haritsa,et al. Analyzing Plan Diagrams of Database Query Optimizers , 2005, VLDB.
[53] Felix Naumann,et al. Schema matching using duplicates , 2005, 21st International Conference on Data Engineering (ICDE'05).
[54] Luis Gravano,et al. Approximate String Joins in a Database (Almost) for Free , 2001, VLDB.
[55] H. B. Newcombe,et al. Automatic linkage of vital records. , 1959, Science.
[56] William E. Winkler,et al. The State of Record Linkage and Current Research Problems , 1999 .
[57] Lise Getoor,et al. Iterative record linkage for cleaning and integration , 2004, DMKD '04.
[58] Surajit Chaudhuri,et al. Data cleaning in Microsoft SQL Server 2005 , 2005, SIGMOD '05.
[59] Ahmed K. Elmagarmid,et al. TAILOR: a record linkage toolbox , 2002, Proceedings 18th International Conference on Data Engineering.
[60] William W. Cohen,et al. Learning to match and cluster large high-dimensional data sets for data integration , 2002, KDD.
[61] Ilaria Bartolini,et al. String Matching with Metric Trees Using an Approximate Distance , 2002, SPIRE.
[62] Matthew A. Jaro,et al. Probabilistic linkage of large public health data files. , 1995, Statistics in medicine.
[63] Pradeep Ravikumar,et al. Adaptive Name Matching in Information Integration , 2003, IEEE Intell. Syst..
[64] Pedro M. Domingos,et al. Object Identification with Attribute-Mediated Dependences , 2005, PKDD.
[65] Stuart J. Russell,et al. Identity Uncertainty and Citation Matching , 2002, NIPS.
[66] Kaizhong Zhang,et al. Exact and approximate algorithms for unordered tree matching , 1994, IEEE Trans. Syst. Man Cybern..
[67] Chen Li,et al. Efficient record linkage in large data sets , 2003, Eighth International Conference on Database Systems for Advanced Applications (DASFAA 2003).
[68] Jeremy A. Hylton,et al. Identifying and Merging Related Bibliographic Records , 1996 .
[69] Hans-Peter Kriegel,et al. Efficient Similarity Search for Hierarchical Data in Large Databases , 2004, EDBT.
[70] Christopher D. Manning,et al. Using Feature Conjunctions Across Examples for Learning Pairwise Classifiers , 2004, ECML.
[71] Terence John Parr,et al. ANother Tool for Language Recognition , 2005 .
[72] Jaideep Srivastava,et al. Entity identification in database integration , 1993, Proceedings of IEEE 9th International Conference on Data Engineering.
[73] Yannis Papakonstantinou,et al. Object Fusion in Mediator Systems , 1996, VLDB.
[74] Dmitri V. Kalashnikov,et al. Domain-independent data cleaning via analysis of entity-relationship graph , 2006, TODS.
[75] Dennis Shasha,et al. Declarative Data Cleaning: Language, Model, and Algorithms , 2001, VLDB.
[76] Felix Naumann,et al. Detecting Duplicates in Complex XML Data , 2006, 22nd International Conference on Data Engineering (ICDE'06).
[77] Raymond J. Mooney,et al. Adaptive Blocking: Learning to Scale Up Record Linkage , 2006, Sixth International Conference on Data Mining (ICDM'06).
[78] Jayant Madhavan,et al. Reference reconciliation in complex information spaces , 2005, SIGMOD '05.
[79] Erhard Rahm,et al. Data Cleaning: Problems and Current Approaches , 2000, IEEE Data Eng. Bull..
[80] Jennifer Widom,et al. Database systems - the complete book (international edition) , 2002 .
[81] Pavel Berkhin,et al. A Survey of Clustering Data Mining Techniques , 2006, Grouping Multidimensional Data.
[82] Rajeev Motwani,et al. Robust and efficient fuzzy match for online data cleaning , 2003, SIGMOD '03.
[83] Carlo Batini,et al. Data Quality: Concepts, Methodologies and Techniques , 2006, Data-Centric Systems and Applications.
[84] Felix Naumann,et al. Declarative Data Fusion - Syntax, Semantics, and Implementation , 2005, ADBIS.
[85] Raghu Ramakrishnan,et al. DBLife: A Community Information Management Platform for the Database Research Community (Demo) , 2007, CIDR.
[86] Joseph M. Hellerstein,et al. Potter's Wheel: An Interactive Data Cleaning System , 2001, VLDB.
[87] Pradeep Ravikumar,et al. A Hierarchical Graphical Model for Record Linkage , 2004, UAI.
[88] Elliotte Rusty Harold,et al. XML in a Nutshell , 2001 .