Mining for Information Discovery on the Web: Overview and Illustrative Research
Jiawei Han | AnHai Doan | Hwanjo Yu
[1] Amihai Motro,et al. Database Schema Matching Using Machine Learning with Feature Selection , 2002, CAiSE.
[2] Nicholas Kushmerick,et al. Wrapper Induction for Information Extraction , 1997, IJCAI.
[3] Chih-Jen Lin,et al. Training nu-Support Vector Classifiers: Theory and Algorithms , 2001, Neural Comput..
[4] Anuradha Bhamidipaty,et al. Interactive deduplication using active learning , 2002, KDD.
[5] Joseph M. Hellerstein,et al. Eddies: Continuous Query Optimization , 1999, SIGMOD 2000.
[6] Jian Pei,et al. CMAR: accurate and efficient classification based on multiple class-association rules , 2001, Proceedings 2001 IEEE International Conference on Data Mining.
[7] Deborah L. McGuinness,et al. The Chimaera Ontology Environment , 2000, AAAI/IAAI.
[8] Arnon Rosenthal,et al. Data Integration Needs an Industrial Revolution , 2001 .
[9] Joann J. Ordille,et al. Querying Heterogeneous Information Sources Using Source Descriptions , 1996, VLDB.
[10] Kevin Chen-Chuan Chang,et al. PEBL: positive example based learning for Web page classification using SVM , 2002, KDD.
[11] Erhard Rahm,et al. A survey of approaches to automatic schema matching , 2001, The VLDB Journal.
[12] David W. Embley,et al. Multifaceted Exploitation of Metadata for Attribute Match Discovery in Information Integration , 2001, Workshop on Information Integration on the Web.
[13] Tom M. Mitchell,et al. Discovering Test Set Regularities in Relational Domains , 2000, ICML.
[14] Salvatore J. Stolfo,et al. The merge/purge problem for large databases , 1995, SIGMOD '95.
[15] Dan Roth,et al. Probabilistic Reasoning for Entity & Relation Recognition , 2002, COLING.
[16] Alberto O. Mendelzon,et al. Database techniques for the World-Wide Web: a survey , 1998, SIGMOD Rec..
[17] Alon Y. Halevy,et al. An adaptive query execution system for data integration , 1999, SIGMOD '99.
[18] Sourav S. Bhowmick,et al. Research Issues in Web Data Mining , 1999, DaWaK.
[19] William W. Cohen. Integration of heterogeneous databases without common domains using queries based on textual similarity , 1998, SIGMOD '98.
[20] Gerhard Weikum,et al. The BINGO! System for Information Portal Generation and Expert Web Search , 2003, CIDR.
[21] Jeffrey F. Naughton,et al. On schema matching with opaque column names and data values , 2003, SIGMOD '03.
[22] Jiawei Han,et al. Object Matching for Information Integration: A Profiler-Based Approach , 2003, IIWeb.
[23] Daphne Koller,et al. Hierarchically Classifying Documents Using Very Few Words , 1997, ICML.
[24] Pedro M. Domingos,et al. Reconciling schemas of disparate data sources: a machine-learning approach , 2001, SIGMOD '01.
[25] Oren Etzioni,et al. Fast and Intuitive Clustering of Web Documents , 1997, KDD.
[26] Subbarao Kambhampati,et al. Optimizing Recursive Information-Gathering Plans , 1999, IJCAI.
[27] Luis Gravano,et al. Text joins for data cleansing and integration in an RDBMS , 2003, Proceedings 19th International Conference on Data Engineering.
[28] Kevin Chen-Chuan Chang,et al. Statistical schema matching across web query interfaces , 2003, SIGMOD '03.
[29] Chris Clifton,et al. SEMINT: A tool for identifying attribute correspondences in heterogeneous databases using neural networks , 2000, Data Knowl. Eng..
[30] Felix Naumann,et al. Attribute classification using feature analysis , 2002, Proceedings 18th International Conference on Data Engineering.
[31] Joseph M. Hellerstein,et al. Potter's Wheel: An Interactive Data Cleaning System , 2001, VLDB.
[32] Yiming Yang,et al. A re-examination of text categorization methods , 1999, SIGIR '99.
[33] Charles Elkan,et al. The Field Matching Problem: Algorithms and Applications , 1996, KDD.
[34] Amihai Motro,et al. Autoplex: Automated Discovery of Content for Virtual Databases , 2001, CoopIS.
[35] David W. Embley,et al. Record-boundary discovery in Web documents , 1999, SIGMOD '99.
[36] Tom M. Mitchell,et al. Improving Text Classification by Shrinkage in a Hierarchy of Classes , 1998, ICML.
[37] Craig A. Knoblock,et al. Wrapper generation for semi-structured Internet sources , 1997, SIGMOD Rec..
[38] Surajit Chaudhuri,et al. Eliminating Fuzzy Duplicates in Data Warehouses , 2002, VLDB.
[39] C. Lee Giles,et al. Autonomous citation matching , 1999, AGENTS '99.
[40] Malik Yousef,et al. One-Class SVMs for Document Classification , 2002, J. Mach. Learn. Res..
[41] David J. DeWitt,et al. NiagaraCQ: a scalable continuous query system for Internet databases , 2000, SIGMOD '00.
[42] Piotr Indyk,et al. Enhanced hypertext categorization using hyperlinks , 1998, SIGMOD '98.
[43] Chaomei Chen,et al. Mining the Web: Discovering knowledge from hypertext data , 2004, J. Assoc. Inf. Sci. Technol..
[44] William W. Cohen,et al. Learning to Match and Cluster Entity Names , 2001 .
[46] Nicholas Kushmerick,et al. Wrapper verification , 2000, World Wide Web.
[47] Andrew McCallum,et al. A Machine Learning Approach to Building Domain-Specific Search Engines , 1999, IJCAI.
[48] James P. Callan,et al. Automatic discovery of language models for text databases , 1999, SIGMOD '99.
[49] Thorsten Joachims,et al. Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.
[50] Prasenjit Mitra,et al. Semi-automatic Integration of Knowledge Sources , 1999 .
[51] William W. Cohen,et al. A flexible learning system for wrapping tables and lists in HTML documents , 2002, WWW.
[52] Gideon S. Mann,et al. Analyses for elucidating current question answering technology , 2001, Natural Language Engineering.
[53] Prabhakar Raghavan,et al. Scalable feature selection, classification and signature generation for organizing large text databases into hierarchical topic taxonomies , 1998, The VLDB Journal.
[54] Erhard Rahm,et al. Generic Schema Matching with Cupid , 2001, VLDB.
[55] Jennifer Neville,et al. Iterative Classification in Relational Data , 2000 .
[56] Andrew McCallum,et al. Efficient clustering of high-dimensional data sets with application to reference matching , 2000, KDD '00.
[57] Jennifer Widom,et al. The TSIMMIS Approach to Mediation: Data Models and Languages , 1997, Journal of Intelligent Information Systems.
[58] C. Lee Giles,et al. CiteSeer: an automatic citation indexing system , 1998, DL '98.
[59] Laura M. Haas,et al. Data-driven understanding and refinement of schema mappings , 2001, SIGMOD '01.
[60] Mark A. Musen,et al. PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment , 2000, AAAI/IAAI.
[61] Jiawei Han,et al. Data Mining for Web Intelligence , 2002, Computer.
[62] Susan T. Dumais,et al. Bringing order to the Web: automatically categorizing search results , 2000, CHI.
[63] Luis Gravano,et al. Probe, count, and classify: categorizing hidden web databases , 2001, SIGMOD '01.
[64] Daniel Kudenko,et al. Transferring and Retraining Learned Information Filters , 1997, AAAI/IAAI.
[65] Pedro M. Domingos,et al. Learning to map between ontologies on the semantic web , 2002, WWW '02.
[66] R. Mooney,et al. Learning to Combine Trained Distance Metrics for Duplicate Detection in Databases , 2002 .
[67] Tova Milo,et al. Using Schema Matching to Simplify Heterogeneous Data Translation , 1998, VLDB.
[68] Erhard Rahm,et al. Similarity flooding: a versatile graph matching algorithm and its application to schema matching , 2002, Proceedings 18th International Conference on Data Engineering.
[69] Soumen Chakrabarti,et al. Data mining for hypertext: a tutorial survey , 2000, SIGKDD Explor..
[70] Hendrik Blockeel,et al. Web mining research: a survey , 2000, SIGKDD Explor..
[71] Valter Crescenzi,et al. RoadRunner: Towards Automatic Data Extraction from Large Web Sites , 2001, VLDB.
[72] Thorsten Joachims,et al. Text categorization with support vector machines , 1999 .
[73] Craig A. Knoblock,et al. Learning domain-independent string transformation weights for high accuracy object identification , 2002, KDD.
[74] Dennis Shasha,et al. An extensible Framework for Data Cleaning , 2000, Proceedings of 16th International Conference on Data Engineering.
[75] Chih-Jen Lin,et al. Training nu-Support Vector Classifiers: Theory and Algorithms , 2001, Neural Computation.
[76] Alexandros Ntoulas,et al. Effective Change Detection Using Sampling , 2002, VLDB.
[77] Robert P. W. Duin,et al. Support vector domain description , 1999, Pattern Recognit. Lett..
[78] Andrew McCallum,et al. Automating the Construction of Internet Portals with Machine Learning , 2000, Information Retrieval.
[79] Oren Etzioni,et al. Web document clustering: a feasibility demonstration , 1998, SIGIR '98.
[80] AnHai Doan,et al. Learning to match ontologies on the Semantic Web , 2003, VLDB 2003.
[81] Laura M. Haas,et al. Optimizing Queries Across Diverse Data Sources , 1997, VLDB.
[82] Mitesh Patel,et al. Structured databases on the web: observations and implications , 2004, SIGMOD Rec..
[83] Sebastian Thrun,et al. Learning to Classify Text from Labeled and Unlabeled Documents , 1998, AAAI/IAAI.
[84] Tom M. Mitchell,et al. Learning to construct knowledge bases from the World Wide Web , 2000, Artif. Intell..
[85] François Denis,et al. PAC Learning from Positive Statistical Queries , 1998, ALT.
[86] Craig A. Knoblock,et al. Wrapper Maintenance: A Machine Learning Approach , 2003, J. Artif. Intell. Res..
[87] Erhard Rahm,et al. COMA - A System for Flexible Combination of Schema Matching Approaches , 2002, VLDB.
[88] Jennifer Widom,et al. The TSIMMIS Project: Integration of Heterogeneous Information Sources , 1994, IPSJ.
[89] Rémi Gilleron,et al. Learning from positive and unlabeled examples , 2000, Theor. Comput. Sci..
[90] Rémi Gilleron,et al. Positive and Unlabeled Examples Help Learning , 1999, ALT.
[91] Rajeev Motwani,et al. The PageRank Citation Ranking: Bringing Order to the Web , 1999, WWW 1999.
[92] Erhard Rahm,et al. On Matching Schemas Automatically , 2001 .
[93] Robert P. W. Duin,et al. Uniform Object Generation for Optimizing One-class Classifiers , 2002, J. Mach. Learn. Res..
[94] Hwanjo Yu. SVMC: Single-Class Classification With Support Vector Machines , 2003, IJCAI.
[95] Daniel A. Keim,et al. On Knowledge Discovery and Data Mining , 1997 .
[96] P. Schönemann. On artificial intelligence , 1985, Behavioral and Brain Sciences.
[97] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[98] Mark A. Musen,et al. PromptDiff: a fixed-point algorithm for comparing ontology versions , 2002, AAAI/IAAI.
[99] Susan T. Dumais,et al. Hierarchical classification of Web content , 2000, SIGIR '00.
[100] Philip S. Yu,et al. Partially Supervised Classification of Text Documents , 2002, ICML.
[101] Luigi Palopoli,et al. Semi-automatic, semantic discovery of properties from database schemes , 1998, Proceedings. IDEAS'98. International Database Engineering and Applications Symposium.
[102] Dayne Freitag,et al. Multistrategy Learning for Information Extraction , 1998, ICML.
[103] Hans Chalupsky,et al. OntoMorph: A Translation System for Symbolic Knowledge , 2000, KR.
[104] Martin van den Berg,et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery , 1999, Comput. Networks.
[105] Tom M. Mitchell,et al. Using unlabeled data to improve text classification , 2001 .