Layout Based Information Extraction from HTML Documents
[1] Ian Jacobs, et al. Cascading Style Sheets, level 2 CSS2 Specification, 2008.
[2] Wei-Ying Ma, et al. Visual Based Content Understanding towards Web Adaptation, 2002, AH.
[3] Wei-Ying Ma, et al. VIPS: a Vision-based Page Segmentation Algorithm, 2003.
[4] Michael Gertz, et al. Reverse engineering for Web data: from visual to semantic structures, 2002, Proceedings 18th International Conference on Data Engineering.
[5] Dayne Freitag, et al. Information Extraction from HTML: Application of a General Machine Learning Approach, 1998, AAAI/IAAI.
[6] Gail E. Kaiser, et al. DOM-based content extraction of HTML documents, 2003, WWW '03.
[7] Yasuto Ishitani, et al. Document transformation system from papers to XML data based on pivot XML document method, 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.
[8] Jean-Luc Meunier, et al. Optimized XY-cut for determining a page reading order, 2005, Eighth International Conference on Document Analysis and Recognition (ICDAR'05).
[9] Keith L. Clark, et al. Using Grammatical Inference to Automate Information Extraction from the Web, 2001, PKDD.
[10] Baoyao Zhou, et al. Function-based object model towards website adaptation, 2001, WWW '01.