Web Content Extraction Using Clustering with Web Structure
Yan Gao | Xiaotao Huang | Yuhua Li | Fen Wang | Ling Kang | Liqun Huang | Zhizhao Zhang
[1] Yuan Li, et al. Content Extraction from Chinese Web Pages Based on Punctuations Distribution, 2012, 2012 International Conference on Computer Science and Service System.
[2] Wei-Ying Ma, et al. VIPS: a Vision-based Page Segmentation Algorithm, 2003.
[3] Isabel F. Cruz, et al. Measuring Structural Similarity Among Web Documents: Preliminary Results, 1998, EP.
[4] Pabitra Mitra, et al. Extracting semantic structure of web documents using content and visual information, 2005, WWW '05.
[5] Veenu Mangat, et al. A novel approach for content extraction from web pages, 2014, 2014 Recent Advances in Engineering and Computational Sciences (RAECS).
[6] Lin Mao-song. An Extraction Algorithm of Chinese HTML Content Based on Similarity, 2010.
[7] Zeng Li-fang. Content extraction technique for web pages based on HTML-tags, 2010.
[8] Wei-Ying Ma, et al. Extracting Content Structure for Web Pages Based on Visual Representation, 2003, APWeb.
[9] Jie Chen, et al. Combining a segmentation-like approach and a density-based approach in content extraction, 2012.
[10] Yan Guo, et al. ECON: An Approach to Extract Content from Web News Page, 2010, 2010 12th International Asia-Pacific Web Conference.
[11] Sachindra Joshi, et al. A bag of paths model for measuring structural similarity in Web documents, 2003, KDD '03.