Scheduling Queries to Improve the Freshness of a Website

The World Wide Web is a new advertising medium that corporations use to increase their exposure to consumers. Very large websites whose content is derived from a source database must stay fresh, that is, they must reflect changes made to the base data. This issue is particularly significant for websites that present fast-changing information, such as stock-exchange and product information. In this article, we formally define and study the freshness of a website that is refreshed by a scheduled set of queries that fetch fresh data from the databases. We propose several online scheduling algorithms and compare their performance on the freshness metric. We show that maximizing the freshness of a website is an NP-hard problem and that the scheduling algorithm MiEF performs better than the other proposed algorithms. Empirical results support this conclusion.
