Evaluating FPGA-acceleration for real-time unstructured search

Emerging data-centric workloads that extract useful insights from large amounts of unstructured data require corresponding data-centric optimizations of the system architecture. In particular, with the growing importance of power and cooling costs, a key challenge for such future designs is to increase performance while maintaining high energy efficiency. At the same time, recent trends towards better support for reconfigurable logic enable the use of energy-efficient accelerators. Combining these trends, we examine in this paper the applicability of acceleration in future data-centric system architectures. We focus on an important class of data-centric workloads, real-time unstructured search, or information filtering, in which large collections of documents are scored against specific topic profiles, and we present an FPGA-based implementation to accelerate such workloads. Our implementation, based on the GiDEL PROCStar IV board using Altera Stratix IV FPGAs, demonstrates excellent performance and energy efficiency, 20 to 40 times better than baseline server systems for typical usage scenarios. Our results also offer insights for the design of accelerators in future data-centric systems.
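To make the filtering task concrete, the following is a minimal software sketch of scoring documents against a topic profile. It is an illustration only, not the FPGA design described here: the weighted term-frequency score, the threshold, and the names `score_document`, `filter_stream`, `profile`, and `documents` are all assumptions introduced for this example.

```python
from collections import Counter


def score_document(tokens, profile):
    """Sum the profile weight of each term times its count in the document.

    Assumption: a profile is a dict mapping term -> weight. The actual
    scoring function used by the accelerator is not stated in the abstract.
    """
    counts = Counter(tokens)
    return sum(weight * counts[term] for term, weight in profile.items())


def filter_stream(documents, profile, threshold):
    """Yield (doc_id, score) for each document scoring at least `threshold`."""
    for doc_id, text in documents:
        # Naive whitespace tokenization, chosen only to keep the sketch short.
        score = score_document(text.lower().split(), profile)
        if score >= threshold:
            yield doc_id, score


# Hypothetical profile and document stream.
profile = {"fpga": 2.0, "accelerator": 1.5, "energy": 1.0}
documents = [
    ("d1", "FPGA accelerator improves energy efficiency"),
    ("d2", "A recipe for tomato soup"),
    ("d3", "energy prices rise"),
]

matches = list(filter_stream(documents, profile, threshold=2.0))
print(matches)
```

Each document is scored independently of every other, which is the property that makes this kind of workload a natural candidate for parallel hardware.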
