Exploiting parallelism in pattern matching: an information retrieval application

We propose a document-searching architecture that uses high-speed hardware pattern matching to increase the throughput of an information retrieval system. We also propose a new parallel VLSI pattern-matching algorithm, the Data Parallel Pattern Matching (DPPM) algorithm, which broadcasts the pattern serially and compares it against a block of data in parallel. The DPPM algorithm exploits the high degree of integration offered by VLSI technology to attain very high-speed processing through parallelism. The performance of the DPPM algorithm has been evaluated both analytically and by simulation. Based on the simulation statistics and a timing analysis of the hardware design, a search rate of multiple gigabytes per second is achievable using 2-μm CMOS technology. The potential performance of the proposed document-searching architecture is also analyzed using the simulation statistics of the DPPM algorithm.
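The broadcast-and-compare idea can be illustrated with a small software sketch. This is not the paper's hardware design: the function name `dppm_search`, the per-position match flags, and the early exit are assumptions made for illustration only, and the abstract does not say how the real design handles block boundaries or mismatches. Each pattern character is "broadcast" in turn (the serial loop), and every candidate start position in the block is compared against it in one step (the list comprehension stands in for the parallel comparators).

```python
def dppm_search(block: bytes, pattern: bytes) -> list[int]:
    """Return the start offsets in `block` at which `pattern` occurs.

    Software simulation of a serial-broadcast, parallel-compare search.
    Hypothetical sketch; not the circuit described in the paper.
    """
    n, m = len(block), len(pattern)
    if m == 0 or m > n:
        return []

    # One match flag per candidate start position. In hardware these would
    # be per-cell state bits updated simultaneously (an assumption).
    alive = [True] * (n - m + 1)

    # Serial part: broadcast the pattern one character at a time.
    for j, ch in enumerate(pattern):
        # Parallel part: every surviving position compares its j-th
        # character against the broadcast character in the same step.
        alive = [flag and block[i + j] == ch for i, flag in enumerate(alive)]
        if not any(alive):
            break  # no candidate survived, so stop broadcasting

    return [i for i, flag in enumerate(alive) if flag]


if __name__ == "__main__":
    print(dppm_search(b"abracadabra", b"abra"))
```

In this model the number of broadcast steps is bounded by the pattern length rather than the block length, which is the property that makes wide parallel comparison attractive; the sequential Python list comprehension only mimics that behaviour and says nothing about real hardware timing.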
