What makes an automatic keyword classification effective?

Though the idea of automatically constructing a keyword classification for retrieval purposes is not new, comparatively few systematic experiments have been carried out in this area. Many suggestions have been put forward, but not enough is known about the behaviour of automatic keyword classifications, and hence about the properties such classifications should have and the ways they should be used. In previous experiments we showed that some forms of classification could give good results. This paper describes a further series of tests designed to examine this sort of classification in more detail, with a view to establishing the optimum forms of classification and the procedures for using them in different retrieval situations. These tests demonstrate that further improvements in performance over that for unclassified keywords can be obtained, and that definite conclusions can be drawn about the correct approach to classification for collections like the test one. The best results are given when grouping is confined to strongly connected, non‐frequent keywords; when the classification is used to provide additional rather than alternative indexing terms, particularly for requests; and when matching is controlled by keyword collection frequency.
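
The three conditions for best results can be illustrated with a minimal Python sketch. It is not the procedure used in the tests. The Tanimoto coefficient as the measure of connection, the thresholds `max_freq` and `min_sim`, the logarithmic weight, and all function names are assumptions made for illustration, and the five-document collection is invented.

```python
from itertools import combinations
from math import log


def build_postings(docs):
    """Map each keyword to the set of documents it indexes."""
    postings = {}
    for doc_id, keywords in docs.items():
        for k in keywords:
            postings.setdefault(k, set()).add(doc_id)
    return postings


def build_classes(postings, max_freq, min_sim):
    """Group only non-frequent keywords that are strongly connected.

    Assumption: connection strength is the Tanimoto coefficient of the
    two keywords' document sets; max_freq and min_sim are hypothetical
    thresholds.
    """
    rare = sorted(k for k, p in postings.items() if len(p) <= max_freq)
    classes = {}
    for a, b in combinations(rare, 2):
        shared = len(postings[a] & postings[b])
        either = len(postings[a] | postings[b])
        if either and shared / either >= min_sim:
            classes.setdefault(a, set()).add(b)
            classes.setdefault(b, set()).add(a)
    return classes


def expand_request(request, classes):
    """Add class-mates to the request; the original terms are kept."""
    expanded = set(request)
    for k in request:
        expanded |= classes.get(k, set())
    return expanded


def score(request_terms, doc_keywords, freq, n_docs):
    """Sum a weight per matching keyword; frequent keywords count less.

    Assumption: the weight log(N / n) is one possible way to let
    collection frequency control matching.
    """
    return sum(log(n_docs / freq[k]) for k in request_terms & doc_keywords)


# Invented toy collection: document id -> indexing keywords.
docs = {
    "d1": {"retrieval", "thesaurus", "clump"},
    "d2": {"retrieval", "thesaurus", "clump"},
    "d3": {"retrieval", "indexing"},
    "d4": {"retrieval", "clump"},
    "d5": {"retrieval", "indexing"},
}

postings = build_postings(docs)
freq = {k: len(p) for k, p in postings.items()}
classes = build_classes(postings, max_freq=3, min_sim=0.5)
expanded = expand_request({"thesaurus"}, classes)
ranking = sorted(
    docs, key=lambda d: score(expanded, docs[d], freq, len(docs)), reverse=True
)
```

In this toy collection the request keyword `thesaurus` alone does not match document `d4`, but the expanded request does, through the class-mate `clump`. The frequent keyword `retrieval` is excluded from grouping and contributes a weight of zero to any match.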