Enhancing Data Retrieval Using Artificially Synthesized Queries

Abstract One of the areas where Computer Science, Operations Research and Optimization overlap extensively is adaptive data retrieval and storage. In this paper we consider the adaptive reorganization of data, achieved not by using the user's query stream but by using a synthesized query stream that carries more concentrated statistical information about the user's queries than the original. More formally, let R = {R_1, R_2, …, R_n} be a set of data elements. The elements of R are accessed by the users of the system according to a fixed but unknown distribution S = {s_1, s_2, …, s_n}, referred to as the user's query distribution. Rather than reorganizing R according to Q, the stream of queries presented by the user, we reorganize it according to a synthesized query stream Q'. This synthesized stream has its own underlying distribution, S'. Observe that doing this effectively modifies the user's query distribution without the user's knowledge. We show how this transformation can be done in such a way that the distribution S' is more "polarized" with regard to its information content than the original distribution S. The module that achieves this is called a Distribution Changing Technique (DCT) filter. The paper presents the theory of DCT filters (viewed as Stochastic Mealy Automata) and catalogues various DCT filters. Finally, the problem of cascading DCT filters is studied, and simulation results that support the theoretical results are included.
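To make the idea of a "more polarized" synthesized stream concrete, the following Python sketch uses a simple illustrative filter that is an assumption of this example and is not taken from the paper's catalogue of DCT filters: it passes a query into Q' only when that query repeats the immediately preceding one. Assuming independent queries drawn from S, such a filter emits element i with probability proportional to s_i^2, so S' has lower entropy than S. The names two_in_a_row_filter, predicted_output, entropy and simulate are hypothetical.

```python
import math
import random
from collections import Counter


def two_in_a_row_filter(queries):
    """Illustrative filter (an assumption, not the paper's DCT filter):
    emit a query only when it equals the immediately preceding query."""
    previous = None
    for q in queries:
        if q == previous:
            yield q
        previous = q


def predicted_output(s):
    """Output distribution S' under independent queries: s_i^2, normalized."""
    total = sum(p * p for p in s)
    return [p * p / total for p in s]


def entropy(dist):
    """Shannon entropy in bits; lower entropy means a more polarized distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def simulate(s, n_queries=200_000, seed=0):
    """Draw a query stream Q from S, filter it into Q', and estimate S' empirically."""
    rng = random.Random(seed)
    elements = list(range(len(s)))
    stream = rng.choices(elements, weights=s, k=n_queries)
    counts = Counter(two_in_a_row_filter(stream))
    emitted = sum(counts.values())
    return [counts[i] / emitted for i in elements]


if __name__ == "__main__":
    s = [0.4, 0.3, 0.2, 0.1]  # hypothetical user query distribution S
    print("S' predicted:", predicted_output(s))
    print("S' simulated:", simulate(s))
    print("entropy S :", entropy(s))
    print("entropy S':", entropy(predicted_output(s)))
```

A list-reorganizing scheme fed with the filtered stream rather than the raw one would see the frequently accessed elements stand out more sharply, at the cost of discarding part of the query stream; whether that trade-off pays off for a given filter is the kind of question the paper's analysis addresses.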
