Web Page Noise Reduction Algorithm Using Non-template Approach

[1] Zhenghong Liu, et al. Algorithm Research for the Noise of Information Extraction Based Vision and DOM Tree, 2009, 2009 International Symposium on Intelligent Ubiquitous Computing and Education.

[2] Liu Dongfei, et al. Research in Identification and Purification of the Bilingual Web Page, 2008, 2008 ISECS International Colloquium on Computing, Communication, Control, and Management.

[3] Logrank: a clickstream-based web page importance metric for web crawlers, 2012.

[4] Zhang Weiwei, et al. A Web Information Extraction Method Based on Ontology, 2012.

[5] Jihua Song, et al. Web Content Information Extraction Approach Based on Removing Noise and Content-Features, 2010, 2010 International Conference on Web Information Systems and Mining.

[6] B. Rothenburger, et al. A Comparison of Dimensionality Reduction Techniques for Web Structure Mining, 2007, IEEE/WIC/ACM International Conference on Web Intelligence (WI'07).

[7] Bin Fan, et al. Web Page Classification Based on a Least Square Support Vector Machine with Latent Semantic Analysis, 2008, 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery.

[8] Deepayan Chakrabarti, et al. Page-level template detection via isotonic smoothing, 2007, WWW '07.

[9] Xiaofeng Wang, et al. A Novel Data Purification Algorithm Based on Outlier Mining, 2009, 2009 Ninth International Conference on Hybrid Intelligent Systems.