Hardware Systems for Text Information Retrieval

As databases grow very large, conventional digital computers cannot provide satisfactory response times. This is particularly true for text databases, which must often be several orders of magnitude larger than formatted databases to store a useful amount of information. Even standard techniques for improving system performance, such as inverted files, may not be sufficient, and an unconventional hardware organization may become necessary.

A variety of organizations have been proposed to speed up text retrieval operations. Most of this work has concentrated on the design of fast, efficient search engines, which can be divided into three classes: associative memories, cellular pattern matchers, and finite state automata. The advantages and disadvantages inherent in each approach are discussed, along with a number of proposed implementations. Finally, the text retrieval system under development at the University of Utah is discussed in more detail.
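To make the finite state automaton class of search engine more concrete, the following is a minimal software sketch of multi-term matching over a character stream. It uses the Aho-Corasick construction, which is chosen here purely for illustration and is not claimed to be the design of any hardware discussed in this paper. The function names `build_automaton` and `search` and the sample terms are hypothetical.

```python
from collections import deque


def build_automaton(terms):
    """Build goto, fail, and output tables for a set of search terms.

    Illustrative Aho-Corasick construction (an assumption of this sketch,
    not a description of any particular hardware search engine).
    """
    goto = [{}]       # goto[state][char] -> next state
    output = [set()]  # terms recognized on entering each state

    # Phase 1: insert every term into a trie rooted at state 0.
    for term in terms:
        state = 0
        for ch in term:
            if ch not in goto[state]:
                goto.append({})
                output.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].add(term)

    # Phase 2: compute failure links breadth-first from the root.
    fail = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fall back to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit terms that end at the fallback state (proper suffixes).
            output[nxt] |= output[fail[nxt]]
    return goto, fail, output


def search(text, automaton):
    """Scan text one character at a time; return (start, term) pairs."""
    goto, fail, output = automaton
    state = 0
    hits = []
    for pos, ch in enumerate(text):
        # Follow failure links until some state accepts this character.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for term in sorted(output[state]):
            hits.append((pos - len(term) + 1, term))
    return hits


if __name__ == "__main__":
    automaton = build_automaton(["text", "retrieval", "ext"])
    for start, term in search("text retrieval", automaton):
        print(start, term)
```

The point of the sketch is that each input character is examined exactly once, regardless of how many terms are being searched for, which is the property that makes an automaton attractive for scanning text at the rate it arrives from storage.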
