A generic machine for parallel information retrieval

Abstract

Information retrieval is the recovery of documents that match a requester's query. Ideally, this process should be both effective and efficient. Among the many strategies proposed in the literature, several are hardware enhancements to retrieval, mostly involving file servers, intermediate buffer memories, array processors, and associative memories or coprocessors. In this article, we introduce two distribution schemes that partition documents over multiple processors, together with the corresponding multiprocessor retrieval algorithms that match relevant documents to user queries. The suggested framework is based on a general-purpose hypercube multicomputer architecture with a dedicated disk for each node. This hardware feature, together with special storage and access methods, makes the proposal a justifiable and viable application of parallelism in information retrieval.