Information Extraction from Web Pages Based on Their Visual Representation

This research aims to improve the efficiency of web information extraction and web accessibility. We present the motivation behind the research, its aim, and its objectives, and describe the work performed on developing a web page model for information extraction. We also present work on making extracted information accessible to blind users, giving them the means to navigate to and access the required information quickly. Finally, we describe our ongoing research on efficient methods and approaches for extracting information from the proposed model. Two main approaches are considered: 1) developing a library that provides the required functionality to the programmer; 2) developing a declarative Datalog-like language for information extraction.

Ruslan R. Fayzrakhmanov
