Exploiting content redundancy for web information extraction
[1] Yida Wang,et al. Incorporating site-level knowledge to extract structured data from web forums, 2009, WWW '09.
[2] Eugene Agichtein,et al. Mining reference tables for automatic text segmentation , 2004, KDD.
[3] Bing Liu,et al. Web data extraction based on partial tree alignment , 2005, WWW '05.
[4] W. A. Beyer,et al. Some Biological Sequence Metrics, 1976.
[5] Matthew Richardson,et al. Markov logic networks , 2006, Machine Learning.
[6] Anuradha Bhamidipaty,et al. Interactive deduplication using active learning , 2002, KDD.
[7] Luis Gravano,et al. Text joins in an RDBMS for web data integration , 2003, WWW '03.
[8] Wei-Ying Ma,et al. Simultaneous record detection and attribute labeling in web data extraction , 2006, KDD '06.
[9] Surajit Chaudhuri,et al. A Primitive Operator for Similarity Joins in Data Cleaning , 2006, 22nd International Conference on Data Engineering (ICDE'06).
[10] Sergey Brin,et al. Extracting Patterns and Relations from the World Wide Web , 1998, WebDB.
[11] William W. Cohen. Integration of heterogeneous databases without common domains using queries based on textual similarity , 1998, SIGMOD '98.
[12] Ramakrishnan Srikant,et al. Fast algorithms for mining association rules, 1998, VLDB 1998.
[13] Rajeev Motwani,et al. Robust and efficient fuzzy match for online data cleaning , 2003, SIGMOD '03.
[14] Craig A. Knoblock,et al. Hierarchical Wrapper Induction for Semistructured Information Sources , 2004, Autonomous Agents and Multi-Agent Systems.
[15] Raymond J. Mooney,et al. Adaptive duplicate detection using learnable string similarity measures , 2003, KDD '03.
[16] Andrew Tomkins,et al. The volume and evolution of web page templates , 2005, WWW '05.
[17] Sunita Sarawagi,et al. Automatic segmentation of text into structured records , 2001, SIGMOD '01.
[18] Panagiotis G. Ipeirotis,et al. Duplicate Record Detection: A Survey, 2007.
[19] Valter Crescenzi,et al. RoadRunner: Towards Automatic Data Extraction from Large Web Sites , 2001, VLDB.
[20] Luis Gravano,et al. Snowball: extracting relations from large plain-text collections , 2000, DL '00.
[21] Daniel P. Lopresti,et al. Block Edit Models for Approximate String Matching, 1997, Theor. Comput. Sci.
[22] Hinrich Schütze,et al. Introduction to information retrieval, 2008.
[23] Divesh Srivastava,et al. Record linkage: similarity measures and algorithms , 2006, SIGMOD Conference.
[24] Nicholas Kushmerick,et al. Wrapper Induction for Information Extraction , 1997, IJCAI.