Some experiments in the generation of word and document associations

The solution of most problems in automatic information dissemination and retrieval depends on the availability of methods for the automatic analysis of information content. It is, in fact, impossible to identify, classify, encode, and organize items of information, or requests for information, without first determining the content or subject matter of the information to be processed. In most proposed automatic systems, this analysis is based on a counting procedure that uses the frequency of occurrence of certain words or word classes to generate sets of index terms and to prepare automatic abstracts or extracts.
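
The counting procedure mentioned above can be illustrated with a minimal sketch. It is not the procedure of any particular system: the stop list `STOP_WORDS`, the function name `index_terms`, and the parameters `top_n` and `min_count` are assumptions introduced here for illustration only. The sketch counts word occurrences in a text, discards common function words, and keeps the most frequent remaining words as candidate index terms.

```python
import re
from collections import Counter

# Hypothetical stop list of common function words (an assumption for this
# sketch; a practical system would use a much larger list).
STOP_WORDS = frozenset({
    "a", "an", "and", "by", "for", "in", "is", "it", "of", "on",
    "or", "that", "the", "this", "to", "with",
})


def index_terms(text, top_n=5, min_count=2):
    """Return up to top_n (word, count) pairs, most frequent first.

    Words occurring fewer than min_count times are dropped, on the
    assumption that very rare words are poor content indicators.
    """
    # Lowercase the text and split it into alphabetic word tokens.
    words = re.findall(r"[a-z]+", text.lower())
    # Count every token that is not on the stop list.
    counts = Counter(w for w in words if w not in STOP_WORDS)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(w, c) for w, c in ranked if c >= min_count][:top_n]


sample = (
    "Automatic analysis of information content supports information "
    "retrieval. Retrieval systems count words, and the analysis of "
    "word counts yields index terms."
)
print(index_terms(sample))
```

In this sketch each distinct word form is counted separately, so "word" and "words" are not merged; grouping forms into word classes, as the text mentions, would require an additional normalization step that is not shown here.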