Implementing ranking strategies using text signatures

Signature files provide an efficient access method for text in documents, but retrieval is usually limited to finding documents that contain a specified Boolean pattern of words. Effective retrieval requires that documents with similar meanings be found through a process of plausible inference. The simplest way of implementing this retrieval process is to rank documents in order of their probability of relevance. This paper describes techniques for implementing probabilistic ranking strategies with sequential and bit-sliced signature files and points out the limitations of these implementations with regard to their effectiveness. Signature-based ranking techniques are compared in detail with ranking using term-based document representatives and inverted files. The comparison shows that term-based representations are at least competitive with signature files in terms of efficiency and, in some situations, superior.
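
As an illustration of the general idea of ranking with a sequential signature file, the following is a minimal sketch and not the implementation described in the paper. It assumes superimposed coding, in which each word sets a few hashed bit positions in a fixed-width document signature, and it scores a document by summing the weights of the query terms whose bits are all present. The names `SIG_BITS`, `BITS_PER_WORD`, `word_mask`, `make_signature`, and `rank`, the hash function, and the term weights are all assumptions chosen for the example.

```python
import hashlib

SIG_BITS = 256      # signature width in bits (assumed value)
BITS_PER_WORD = 3   # bit positions set per word (assumed value)


def word_mask(word):
    """Return an integer with up to BITS_PER_WORD hashed bit positions set."""
    mask = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode()).digest()
        pos = int.from_bytes(digest[:4], "big") % SIG_BITS
        mask |= 1 << pos
    return mask


def make_signature(text):
    """Superimpose (bitwise OR) the masks of every word in the text."""
    sig = 0
    for word in text.split():
        sig |= word_mask(word)
    return sig


def rank(signatures, query_weights):
    """Scan all signatures sequentially and rank documents by summed weight.

    A query term counts as present when every bit of its mask is set in the
    document signature. Signatures can report a term that is not actually in
    the document (a false drop), so scores are upper bounds on the true sum.
    """
    masks = {term: word_mask(term) for term in query_weights}
    scored = []
    for doc_id, sig in enumerate(signatures):
        score = sum(
            weight
            for term, weight in query_weights.items()
            if sig & masks[term] == masks[term]
        )
        if score > 0:
            scored.append((doc_id, score))
    # Highest score first; ties broken by document id.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


docs = [
    "signature files for text retrieval",
    "inverted files and term weights",
    "probabilistic ranking of documents",
]
signatures = [make_signature(d) for d in docs]
ranked = rank(signatures, {"signature": 2.0, "files": 1.0})
```

Because a signature records only the presence of a term, within-document frequencies are not available to the scoring function in this sketch, which is one way the effectiveness of signature-based ranking can be limited relative to term-based representations.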
