An auto-indexing technique for databases based on clustering

Given the wide deployment of databases and their size, particularly in data warehouses, it is important to automate physical design so that the work of the database administrator (DBA) is minimized. An important part of physical database design is index selection. The goal of auto-administration is an automatic index selection tool that can analyze large amounts of data and suggest a good set of indexes for a database. Clustering is a data mining technique with broad appeal and usefulness in exploratory data analysis, which motivates applying clustering techniques to obtain good indexes for a database workload. We describe a technique for auto-indexing using clustering. The experiments conducted show that the proposed technique performs better than the Microsoft SQL Server Index Selection Tool (IST) and suggests indexes faster than IST.
