Managing Web-Based Data: Database Models and Transformations

This paper considers the Araneus data model, which applies database techniques and wrappers to Web data. The project features a logical model that abstracts away the physical aspects of Web sites. On top of this model, Araneus provides high-level descriptions of pages that serve two purposes: extracting data from existing Web sites and generating Web sites from databases.
