Structured databases on the web

The Web has been rapidly "deepened" by the prevalence of databases online. With the potentially unlimited information hidden behind their query interfaces, this "deep Web" of searchable databases is...
