Improving Web usability by categorizing information

Modern browsers let users search and navigate the vast amount of data on the Web, but extracting the desired information from that data remains a significant problem, mainly because neither Web pages nor Web sites have an explicit structure. We present an approach to Web structuring that considers both pages and sites. In particular, we analyze their structure and semantics, aiming to highlight their (possibly hidden) structural and semantic organization and thereby build an explicit logical schema, which improves Web usability (browsing and searching) and design.
