FAST-INV: A Fast Algorithm for Building Large Inverted Files
Inverted files are widely used in building bibliographic and other types of retrieval systems. In order to investigate the utility of advanced information retrieval methods for improving access to large online library catalogs, it was necessary to extend the SMART system in a variety of ways. One particular problem was to develop a fast method to produce an inverted file from hundreds of thousands of (partial) MARC records. The FAST-INV software was developed in 1986, taking advantage of the large primary memories available on modern computers and the order inherent in the input data. Using the new algorithm, processing in primary memory for N basic data elements has time complexity O(N), and files that will not fit in primary memory can be processed in a fixed number of passes. Performance studies show this approach to be at least an order of magnitude faster than commonly used techniques. It is hoped that these findings will be of interest to database providers and will help them reduce the cost of building inverted files, as we have been doing for the last five years.
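To make the O(N) claim concrete, the following is a minimal sketch of a linear-time, in-memory inversion. It is not the FAST-INV code itself. It assumes the input is a list of (document number, term number) pairs already ordered by document number, and that term numbers are dense integers in the range 0 to num_terms - 1. The function name `invert` and the two-array output layout are hypothetical choices made for illustration. The sketch counts postings per term, converts the counts to start offsets, and then places each document number directly into its slot, so no comparison sort is needed.

```python
from typing import List, Sequence, Tuple


def invert(pairs: Sequence[Tuple[int, int]], num_terms: int) -> Tuple[List[int], List[int]]:
    """Invert (doc_id, term_id) pairs in O(N + num_terms) time.

    Returns (offsets, postings): the documents containing term t are
    postings[offsets[t]:offsets[t + 1]].
    Assumption: term ids are dense integers in [0, num_terms).
    """
    # Pass 1: count how many postings each term has.
    offsets = [0] * (num_terms + 1)
    for _, term in pairs:
        offsets[term + 1] += 1

    # Prefix sums turn the counts into start offsets.
    for t in range(num_terms):
        offsets[t + 1] += offsets[t]

    # Pass 2: write each doc id into the next free slot of its term.
    cursor = offsets[:-1]  # slicing copies, so offsets is not modified
    postings = [0] * len(pairs)
    for doc, term in pairs:
        postings[cursor[term]] = doc
        cursor[term] += 1
    return offsets, postings


if __name__ == "__main__":
    # Three documents over three terms, input ordered by document number.
    pairs = [(1, 0), (1, 2), (2, 1), (2, 2), (3, 0)]
    offsets, postings = invert(pairs, num_terms=3)
    for t in range(3):
        print(t, postings[offsets[t]:offsets[t + 1]])
```

Because the input is ordered by document number and the second pass preserves that order, each term's posting list comes out sorted by document number without further work. The sketch covers only the case where everything fits in primary memory; it does not attempt the multi-pass handling of larger files that the abstract mentions.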