Box clustering segmentation: A new method for vision-based web page preprocessing
[1] Reda Alhajj,et al. Effectiveness of template detection on noise reduction and websites summarization , 2013, Inf. Sci..
[2] Lejian Liao,et al. A hybrid approach for content extraction with text density and visual importance of DOM nodes , 2013, Knowledge and Information Systems.
[3] Wei-Ying Ma,et al. VIPS: a Vision-based Page Segmentation Algorithm , 2003 .
[4] Mie Mie Su Thwin,et al. Web Page Segmentation and Informative Content Extraction for Effective Information Retrieval , 2014 .
[5] Liang Liu,et al. An Improved VIPS-Based Algorithm of Extracting Web Content , 2014 .
[6] Yu-Chieh Wu. Language independent web news extraction system based on text detection framework , 2016, Inf. Sci..
[7] Peter Fankhauser,et al. Boilerplate detection using shallow text features , 2010, WSDM '10.
[8] Tingting Wei,et al. Web page segmentation based on the hough transform and vision cues , 2015, 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA).
[9] Shinde Santaji Krishna,et al. Schema inference and data extraction from templatized Web pages , 2015, 2015 International Conference on Pervasive Computing (ICPC).
[10] Chengcui Zhang,et al. An FAR-SW based approach for webpage information extraction , 2014, Inf. Syst. Frontiers.
[11] Jer Lang Hong,et al. Information extraction for search engines using fast heuristic techniques , 2010, Data Knowl. Eng..
[12] Chengfei Liu,et al. AutoRM: An effective approach for automatic Web data record mining , 2015, Knowl. Based Syst..
[13] Pavlina Fragkou. Information Extraction versus Text Segmentation for Web Content Mining , 2013, Int. J. Softw. Eng. Knowl. Eng..
[14] Andres Sanoja,et al. Block-o-Matic: A web page segmentation framework , 2014, 2014 International Conference on Multimedia Computing and Systems (ICMCS).
[15] Radek Burget,et al. Cluster-based page segmentation-a fast and precise method for web page pre-processing , 2013, WIMS '13.
[16] Ravi Kumar,et al. Automatic Wrappers for Large Scale Web Extraction , 2011, Proc. VLDB Endow..
[17] Hayri Volkan Agun,et al. Web content extraction by using decision tree learning , 2012, 2012 20th Signal Processing and Communications Applications Conference (SIU).
[18] Ye Tian,et al. Segmenting Webpage with Gomory-Hu Tree Based Clustering , 2011, J. Softw..
[19] M. Elgin Akpinar,et al. Vision Based Page Segmentation Algorithm: Extended and Perceived Success , 2013, ICWE Workshops.
[20] Keishi Tajima,et al. Extracting Logical Hierarchical Structure of HTML Documents Based on Headings , 2015, Proc. VLDB Endow..
[21] Samiran Chattopadhyay,et al. Mobile-enabled content adaptation system for e-learning websites using segmentation algorithm , 2014, The 8th International Conference on Software, Knowledge, Information Management and Applications (SKIMA 2014).
[22] Bo Gao,et al. Multiple Template Detection Based on Segments , 2014, ICDM.
[23] Michael Cormier,et al. Purely vision-based segmentation of web pages for assistive technology , 2016, Comput. Vis. Image Underst..
[24] Abhay Sharma,et al. Understanding Color Management , 2003 .
[25] Pierre Beust,et al. A Hybrid Segmentation of Web Pages for Vibro-Tactile Access on Touch-Screen Devices , 2014, VL@COLING.
[26] Radek Burget. Layout Based Information Extraction from HTML Documents , 2007, Ninth International Conference on Document Analysis and Recognition (ICDAR 2007).
[27] Ashish Kumar. Software Architecture Styles a Survey , 2014 .
[28] Stefan Conrad,et al. Page segmentation by web content clustering , 2011, WIMS '11.
[29] Yeliz Yesilada,et al. Vision Based Page Segmentation: Extended and Improved Algorithm , 2014 .
[30] Steven Pemberton,et al. Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification , 2010 .
[31] Lidong Bing,et al. Web page segmentation with structured prediction and its application in web page classification , 2014, SIGIR.
[32] Zhen Xu,et al. Identifying semantic blocks in Web pages using Gestalt laws of grouping , 2016, World Wide Web.
[33] Radek Burget. Visual Area Classification for Article Identification in Web Documents , 2010, 2010 Workshops on Database and Expert Systems Applications.
[34] Hayri Volkan Agun,et al. A hybrid approach for extracting informative content from web pages , 2013, Inf. Process. Manag..
[35] Jun Zeng,et al. A Web Page Segmentation Approach Using Visual Semantics , 2014, IEICE Trans. Inf. Syst..
[36] Wei-Ying Ma,et al. Improving pseudo-relevance feedback in web information retrieval using web page segmentation , 2003, WWW '03.
[37] Yang Song,et al. Extracting news content with visual unit of web pages , 2015, 2015 IEEE/ACIS 16th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD).
[38] Zhong-Liang Xiang,et al. Wrapper induction of news information for feeding to social networking service on smartphone , 2015, 2015 17th International Conference on Advanced Communication Technology (ICACT).
[39] Panagiotis Papapetrou,et al. Extracting news text from web pages: an application for the visually impaired , 2015, PETRA.
[40] Hassan F. Eldirdiery,et al. Detecting and Removing Noisy Data on Web Document using Text Density Approach , 2015 .
[41] Jun Kong,et al. Web Interface Interpretation Using Graph Grammars , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[42] B M Patil,et al. Template Extraction from Heterogeneous Web Pages with Cosine Similarity , 2014 .
[43] Wei Liu,et al. ViDE: A Vision-Based Approach for Deep Web Data Extraction , 2010, IEEE Transactions on Knowledge and Data Engineering.
[44] G. Potdar,et al. Template Extraction from Heterogeneous Web Pages , 2012 .
[45] Clément de Groc,et al. Mining Product Features from the Web: A Self-supervised Approach , 2012, WEBIST.
[46] Kun Jiang,et al. Noise Reduction of Web Pages via Feature Analysis , 2015, 2015 2nd International Conference on Information Science and Control Engineering.
[47] Dhaval Patel,et al. Removing Noise Content from Online News Articles , 2014, COMAD.
[48] L. Hubert,et al. Comparing partitions , 1985 .
[49] Claudio Feijóo,et al. Emerging Perspectives on the Mobile Content Evolution , 2015 .
[50] Radek Burget,et al. Information Extraction from Web Sources Based on Multi-aspect Content Analysis , 2015, SemWebEval@ESWC.
[51] David A. Bell,et al. Extracting Data Records from Query Result Pages Based on Visual Features , 2011, BNCOD.
[52] David A. Bell,et al. Automatically Annotating Structured Web Data Using a SVM-Based Multiclass Classifier , 2014, WISE.
[53] Hassan F. Eldirdiery,et al. Web Document Segmentation for Better Extraction of Information: A Review , 2015 .
[54] Salvador Tamarit,et al. TeMex: The Web Template Extractor , 2015, WWW.
[55] Donato Malerba,et al. Extracting general lists from web documents: a hybrid approach , 2011, IEA/AIE'11.