FPGA Acceleration of RankBoost in Web Search Engines

Search relevance is a key measure of a search engine's usefulness. Shifts in search relevance among search engines can easily change a search company's market cap by tens of billions of dollars. With the ever-increasing scale of the Web, machine learning technologies have become important tools for improving search relevance ranking. RankBoost is a promising algorithm in this area, but it is not widely used because of its long training time. To reduce the computation time of RankBoost, we designed an FPGA-based accelerator system and an upgraded version of it. The accelerator, plugged into a commodity PC, increased training speed on MSN search engine data by up to 1800x compared to the original software implementation on a server. The proposed accelerator has been used successfully by researchers working on search relevance ranking.
