A stochastic approach for clustering in object bases

Object clustering has long been recognized as important to the performance of object bases, but in most work to date it is not clear exactly what is being optimized or how close to optimal the resulting solutions are. We give a rigorous treatment of a fundamental problem in clustering: given an object base and a probabilistic description of the expected access patterns, what is an optimal object clustering, and how can this optimal clustering be found or approximated? We present a system model for the clustering problem and discuss two models for access patterns in the system. For the first, exact optimal clustering strategies can be found; for the second, we show that the problem is NP-complete, but that it is an instance of a well-studied graph partitioning problem. We propose a new clustering algorithm based upon Kernighan’s heuristic graph partitioning algorithm, and present a preliminary experimental comparison of this new clustering algorithm with several previously proposed clustering algorithms.
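
To make the graph partitioning view concrete, the following is a minimal sketch of a Kernighan-style swap heuristic, not the algorithm proposed in the paper. It assumes that objects are graph nodes, that each edge weight is a hypothetical probability that two objects are accessed together, and that there are only two pages of equal size. The names `cut_weight`, `kl_pass`, and `kl_bisect` are illustrative, and swap gains are recomputed by brute force rather than maintained incrementally.

```python
def cut_weight(weights, side):
    """Total weight of edges whose endpoints sit on different pages."""
    return sum(w for (u, v), w in weights.items() if side[u] != side[v])


def kl_pass(weights, side):
    """One Kernighan-style pass over a two-page assignment.

    Repeatedly swaps the unlocked pair (one object per page) that yields
    the lowest cut weight, locks both objects, and finally returns the
    best assignment seen during the pass together with its cut weight.
    """
    side = dict(side)
    best, best_cost = dict(side), cut_weight(weights, side)
    unlocked = set(side)

    def cost_after(a, b):
        # Cut weight if object a (page 0) and object b (page 1) trade places.
        trial = dict(side)
        trial[a], trial[b] = 1, 0
        return cut_weight(weights, trial)

    while True:
        page0 = sorted(n for n in unlocked if side[n] == 0)
        page1 = sorted(n for n in unlocked if side[n] == 1)
        if not page0 or not page1:
            break
        a, b = min(((a, b) for a in page0 for b in page1),
                   key=lambda pair: cost_after(*pair))
        side[a], side[b] = 1, 0          # perform the tentative swap
        unlocked -= {a, b}               # lock both objects for this pass
        cost = cut_weight(weights, side)
        if cost < best_cost:
            best, best_cost = dict(side), cost
    return best, best_cost


def kl_bisect(weights, side):
    """Repeat passes until a pass no longer lowers the cut weight."""
    cost = cut_weight(weights, side)
    while True:
        new_side, new_cost = kl_pass(weights, side)
        if new_cost >= cost:
            return side, cost
        side, cost = new_side, new_cost


# Hypothetical co-access probabilities for four objects.
weights = {("a", "b"): 0.4, ("c", "d"): 0.4, ("a", "c"): 0.1, ("b", "d"): 0.1}
# Deliberately poor starting placement: strongly related objects are split.
initial = {"a": 0, "c": 0, "b": 1, "d": 1}
pages, cost = kl_bisect(weights, initial)
print(pages, cost)
```

In this toy instance the heuristic moves the strongly related pairs onto the same page, which lowers the probability that an access crosses a page boundary. A realistic clustering would need many pages, page-size constraints, and object sizes, none of which this sketch models.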
