论文信息 - An index selection method without repeated optimizer estimations

An index selection method without repeated optimizer estimations

The index selection problem (ISP) concerns the selection of an appropriate index set to minimize the total cost for a given workload containing read and update queries. Since the ISP has been proven to be an NP-hard problem, most studies focus on heuristic algorithms to obtain approximate solutions. However, even approximate algorithms still consume a large amount of computing time and disk space because these systems must record all query statements and frequently request from the database optimizers the cost estimation of each query in each considered index. This study proposes a novel algorithm without repeated optimizer estimations. When a query is delivered to a database system, the optimizer evaluates the costs of various query plans and chooses an access path for the query. The information from the evaluation stage is aggregated and recorded with limited space. The proposed algorithm can recommend indexes according to the readily available information without querying the optimizer again. The proposed algorithm was tested in a PostgreSQL database system using TPC-H data. Experimental results show the effectiveness of the proposed approach.

I-En Liao | Kuo-Fong Kao

[1] Vijay V. Raghavan,et al. On the Selection of an Optimal Set of Indexes , 1983, IEEE Transactions on Software Engineering.

[2] Yishai A. Feldman,et al. A knowledge-based approach for index selection in relational databases , 2003, Expert Syst. Appl..

[3] Goetz Graefe,et al. Query evaluation techniques for large databases , 1993, CSUR.

[4] Zhen He,et al. Cost modeling of spatial operators using non-parametric regression , 2007, Inf. Sci..

[5] Jeffrey D. Ullman,et al. Index selection for OLAP , 1997, Proceedings 13th International Conference on Data Engineering.

[6] Matteo Fischetti,et al. Exact and Approximate Algorithms for the Index Selection Problem in Physical Database Design , 1995, IEEE Trans. Knowl. Data Eng..

[7] Elena Barcucci,et al. Index Selection in a Distributed Data Base , 1984, DDSS.

[8] Eugene Wong,et al. Query processing in a system for distributed databases (SDD-1) , 1981, TODS.

[9] Egon Balas,et al. An Algorithm for Large Zero-One Knapsack Problems , 1980, Oper. Res..

[10] Elena Barcucci,et al. Index Selection in Relational Databases , 1989, MFDBS.

[11] Philip S. Yu,et al. On Index Selection Schemes for Nested Object Hierarchies , 1994, VLDB.