Posting compression in dynamic retrieval environments

This paper describes a posting compression technique for dynamic full-text document retrieval environments. The technique applies to main-memory document retrieval systems and has two parts: the efficient use of auxiliary tables, and the application of Zipf's well-known rank-frequency law. The paper shows that term weights can be approximated on the basis of this law, so their explicit storage can be avoided.
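
As a rough illustration of the second part, the sketch below estimates a term's collection frequency from its rank alone, using the textbook form of Zipf's law, f(r) = C / r, and derives a weight from that estimate. It is not the paper's actual scheme: the function names, the normalisation of C, and the logarithmic weight formula are all assumptions made for this example.

```python
import math


def zipf_frequency(rank, total_tokens, vocab_size):
    """Estimate the frequency of the term at `rank` (1 = most frequent).

    Assumes the textbook form of Zipf's law, f(r) = C / r, with C chosen
    so that the estimates over all ranks sum to `total_tokens`.
    """
    harmonic = sum(1.0 / r for r in range(1, vocab_size + 1))
    return total_tokens / (rank * harmonic)


def approx_weight(rank, total_tokens, vocab_size):
    """Hypothetical term weight: log of the inverse relative frequency.

    The weight is computed from the rank only, so nothing per-term has
    to be stored beyond the term's position in the frequency ordering.
    """
    freq = zipf_frequency(rank, total_tokens, vocab_size)
    return math.log(total_tokens / freq)


if __name__ == "__main__":
    # Compare a frequent, a mid-ranked and a rare term in a toy collection.
    for rank in (1, 10, 100):
        weight = approx_weight(rank, total_tokens=1_000_000, vocab_size=10_000)
        print(f"rank {rank:>3}: approximate weight {weight:.3f}")
```

Under these assumptions the weight grows with rank, so rarer terms receive higher weights, and a posting needs no explicit weight field because the value can be recomputed from the rank when a query is evaluated.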
