Parallel computing in information retrieval - an updated review

We review the progress of parallel computing in Information Retrieval (IR), stressing in particular the importance of the motivation for using parallel computing in text retrieval. We analyse parallel IR systems using the classification defined by Rasmussen and describe some of these systems. We describe the retrieval models used in parallel information processing, and identify areas where we believe further research is needed.
