Non-Materialized Object Views of Web Datasources

Web datasources are mostly textual and often provide full-text search options for accessing data. However, accessing Web sources by filling in forms, browsing, and reading is tedious. Most users are used to accessing and manipulating data with a rich query language and expect to query Web datasources the same way. To satisfy these users, we propose an approach based on a non-materialized object view of Web datasources. Each query against the object view generates calls to the Web datasource. Retrieved documents are parsed to extract data, which is used either to generate new calls or to build the output returned to the user. Unlike the warehouse approach, in which data is transferred to and stored in a database system, we choose not to materialize the view, to ensure that our users access up-to-date data. Our approach has been developed and demonstrated as part of the multidatabase system supporting queries via uniform Object Protocol Model (OPM) interfaces.
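
The query cycle described above (a query triggers calls, retrieved documents are parsed, and the extracted data either drives further calls or becomes output) can be illustrated with a minimal sketch. It is not the system's implementation and does not use OPM. The names `PAGES`, `fetch`, and `ObjectView`, the request strings, and the `key: value` document layout are all assumptions made for illustration, and an in-memory dictionary stands in for a real Web datasource so that the example runs offline.

```python
import re
from typing import Callable, Dict, Iterator

# Hypothetical stand-in for a Web datasource: request string -> text document.
PAGES: Dict[str, str] = {
    "search?q=kinase": "hit: /entry/P1\nhit: /entry/P2\n",
    "/entry/P1": "name: kinase A\nlength: 120\n",
    "/entry/P2": "name: kinase B\nlength: 340\n",
}


def fetch(request: str) -> str:
    """Simulate one call to the datasource (assumed interface)."""
    return PAGES[request]


class ObjectView:
    """Non-materialized view: nothing is stored between queries."""

    def __init__(self, fetch_fn: Callable[[str], str]) -> None:
        self._fetch = fetch_fn

    def query(
        self, keyword: str, predicate: Callable[[Dict[str, str]], bool]
    ) -> Iterator[Dict[str, str]]:
        # First call: full-text search on the datasource.
        listing = self._fetch(f"search?q={keyword}")
        # Parse the result page; each extracted link generates a new call.
        for link in re.findall(r"^hit: (\S+)$", listing, flags=re.M):
            document = self._fetch(link)
            # Parse "key: value" lines into an object-like record.
            obj = dict(re.findall(r"^(\w+): (.+)$", document, flags=re.M))
            # Keep only the objects that satisfy the query condition.
            if predicate(obj):
                yield obj


view = ObjectView(fetch)
long_entries = list(view.query("kinase", lambda o: int(o["length"]) > 200))
```

Because the view keeps no copy of the data, a change in the datasource is visible to the next query without any refresh step. The cost is that every query repeats the calls and the parsing.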