Paper information - Improving query response time in scientific databases using data aggregation - a case study

Although most state-of-the-art database systems have no inherent limit on the amount of data they can handle, the huge data volumes typical of scientific database applications often exceed what is practically feasible when query performance is the concern. One concept for improving query response time in such applications, well known in theory, is to exploit the categorization and classification schemes often found in scientific computing domains to store data aggregates, so that expensive access to raw data can be replaced by the use of stored aggregated values. This paper presents the results of an empirical performance study carried out in the application domain of market research, which substantiate the practical importance of such work. Using real market research data, it is shown that query response time can be shortened by an order of magnitude if a proper data aggregation concept is used. If the data aggregates are designed properly, the overhead of generating and managing their materializations is far outweighed by the improved query performance in realistic scenarios.
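The general idea of answering a query from a stored aggregate instead of from raw data can be illustrated with a minimal sketch. It is not the system or schema used in the study. The table names (`raw_sales`, `agg_sales_by_category`), the columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for a real scientific or market research database.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with raw data and one stored aggregate."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical raw data: one row per product, classified by category.
    conn.execute(
        "CREATE TABLE raw_sales (product TEXT, category TEXT, units INTEGER)"
    )
    rows = [
        ("cola", "beverages", 10),
        ("juice", "beverages", 5),
        ("bread", "bakery", 7),
        ("cake", "bakery", 3),
    ]
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", rows)
    # Materialize the aggregate once, along the classification attribute.
    conn.execute(
        "CREATE TABLE agg_sales_by_category AS "
        "SELECT category, SUM(units) AS total_units, COUNT(*) AS n_rows "
        "FROM raw_sales GROUP BY category"
    )
    return conn


def total_from_raw(conn, category):
    """Scan the raw rows of one category and sum them at query time."""
    row = conn.execute(
        "SELECT SUM(units) FROM raw_sales WHERE category = ?", (category,)
    ).fetchone()
    return row[0]


def total_from_aggregate(conn, category):
    """Look up the precomputed total; reads one row per category."""
    row = conn.execute(
        "SELECT total_units FROM agg_sales_by_category WHERE category = ?",
        (category,),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = build_demo_db()
    for category in ("beverages", "bakery"):
        print(category, total_from_raw(conn, category),
              total_from_aggregate(conn, category))
```

Both functions return the same totals, but the second reads a single stored row per category rather than every raw row. The sketch leaves out the maintenance side: whenever `raw_sales` changes, the stored aggregate must be refreshed or updated incrementally, which is the overhead the abstract weighs against the gain in query performance.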
