Detecting and Removing Noisy Data on Web Document using Text Density Approach
[1] Yeliz Yesilada, et al. Vision Based Page Segmentation: Extended and Improved Algorithm, 2014.
[2] Eduardo Sany Laber, et al. A fast and simple method for extracting relevant content from news webpages, 2009, CIKM.
[3] Wolfgang Nejdl, et al. A densitometric approach to web page segmentation, 2008, CIKM '08.
[5] A. K. Singh, et al. An Efficient Method of Eliminating Noisy Information in Web Pages for Data Mining, 2004, CIT.
[6] Ziv Bar-Yossef, et al. Template detection via data mining and its applications, 2002, WWW.
[7] Amit Dutta, et al. Noise Elimination from Web Page Based on Regular Expressions for Web Content Mining, 2014.
[8] Efstathios Stamatatos, et al. Extracting informative textual parts from web pages containing user-generated content, 2012, i-KNOW '12.
[9] Vijay Katiyar, et al. A Noise Reduction Approach based on n x 1 Table and XSL Display Method for Efficient Web Data Extraction, 2013.
[10] Juliana Freire, et al. A fast and robust method for web page template detection and removal, 2006, CIKM '06.
[11] Andrew Tomkins, et al. The volume and evolution of web page templates, 2005, WWW '05.
[12] Sandip Debnath, et al. Automatic extraction of informative blocks from webpages, 2005, SAC '05.
[13] Wei-Ying Ma, et al. Learning block importance models for web pages, 2004, WWW '04.