A Supervised Machine Learning Approach for Duplicate Detection over Gazetteer Records
[1] Javier M. Moguerza,et al. Support Vector Machines with Applications , 2006, math/0612817.
[2] Andrew McCallum,et al. Efficient clustering of high-dimensional data sets with application to reference matching , 2000, KDD '00.
[3] Lawrence Philips,et al. The double metaphone search algorithm , 2000 .
[4] Linda L. Hill. Georeferencing - The Geographic Associations of Information , 2009, Digital libraries and electronic publishing.
[5] Jeffrey Xu Yu,et al. Efficient similarity joins for near duplicate detection , 2008, WWW.
[6] Xing Xie,et al. Detecting nearly duplicated records in location datasets , 2010, GIS '10.
[7] Dekang Lin,et al. An Information-Theoretic Definition of Similarity , 1998, ICML.
[8] Lise Getoor,et al. GeoDDupe: A Novel Interface for Interactive Entity Resolution in Geospatial Data , 2007, 2007 11th International Conference Information Visualization (IV '07).
[9] Yoav Freund,et al. The Alternating Decision Tree Learning Algorithm , 1999, ICML.
[10] Catriel Beeri,et al. Object Fusion in Geographic Information Systems , 2004, VLDB.
[11] Ian H. Witten,et al. Data mining: practical machine learning tools and techniques with Java implementations , 2002, SIGMOD Rec..
[12] Raymond J. Mooney,et al. Adaptive duplicate detection using learnable string similarity measures , 2003, KDD '03.
[13] Panagiotis G. Ipeirotis,et al. Duplicate Record Detection: A Survey , 2007 .
[14] Felix Naumann,et al. An Introduction to Duplicate Detection , 2010 .
[15] Linda L. Hill. Georeferencing: The Geographic Associations of Information (Digital Libraries and Electronic Publishing) , 2006 .
[16] Salvatore J. Stolfo,et al. The merge/purge problem for large databases , 1995, SIGMOD '95.
[17] Stuart J. Russell,et al. Identity Uncertainty and Citation Matching , 2002, NIPS.
[18] Raghav Kaushik,et al. Efficient exact set-similarity joins , 2006, VLDB.
[19] Raymond J. Mooney,et al. Adaptive Blocking: Learning to Scale Up Record Linkage , 2006, Sixth International Conference on Data Mining (ICDM'06).
[20] Clodoveu A. Davis,et al. Approximate String Matching for Geographic Names and Personal Names , 2007, GEOINFO.
[21] Vladimir I. Levenshtein,et al. Binary codes capable of correcting deletions, insertions, and reversals , 1965 .
[22] Craig A. Knoblock,et al. Learning domain-independent string transformation weights for high accuracy object identification , 2002, KDD.
[23] Thorsten Joachims,et al. Making large scale SVM learning practical , 1998 .
[24] J. T. Hastings,et al. Automated conflation of digital gazetteer data , 2008, Int. J. Geogr. Inf. Sci..
[25] William E. Winkler,et al. Methods for Record Linkage and Bayesian Networks , 2002 .
[26] Anuradha Bhamidipaty,et al. Interactive deduplication using active learning , 2002, KDD.
[27] Yu Deng,et al. Finding Similar Objects Using a Taxonomy: A Pragmatic Approach , 2006, OTM Conferences.
[28] Pradeep Ravikumar,et al. A Comparison of String Distance Metrics for Name-Matching Tasks , 2003, IIWeb.
[29] David A. Landgrebe,et al. A survey of decision tree classifier methodology , 1991, IEEE Trans. Syst. Man Cybern..
[30] Lise Getoor,et al. Entity resolution in geospatial data integration , 2006, GIS '06.
[31] Philip Resnik,et al. Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language , 1999, J. Artif. Intell. Res..
[32] Ian H. Witten,et al. The WEKA data mining software: an update , 2009, SIGKDD Explor..
[33] Ashok Samal,et al. A feature-based approach to conflation of geospatial sources , 2004, Int. J. Geogr. Inf. Sci..
[34] Linda L. Hill,et al. Core Elements of Digital Gazetteers: Placenames, Categories, and Footprints , 2000, ECDL.
[35] Mikhail Bilenko,et al. On Evaluation and Training-Set Construction for Duplicate Detection , 2003 .
[36] B. Schölkopf,et al. Advances in kernel methods: support vector learning , 1999 .
[37] William W. Cohen,et al. Learning to match and cluster large high-dimensional data sets for data integration , 2002, KDD.
[38] Roberto J. Bayardo,et al. Scaling up all pairs similarity search , 2007, WWW '07.
[39] Charles Elkan,et al. The Field Matching Problem: Algorithms and Applications , 1996, KDD.