Mathematics and Information Retrieval

The development of a discipline in science and technology often depends on the availability of theories that can describe the processes controlling the field and model the interactions between these processes. The absence of an accepted theory of information retrieval has been blamed for the relative disorder and the lack of technical advances in the area. This study examines the main mathematical approaches to information retrieval, including both algebraic and probabilistic models, and describes the difficulties that impede the formalization of information retrieval processes. It also covers a number of developments in which new theoretical understanding has led directly to improved retrieval techniques and operations.
