Fast Ranking with Additive Ensembles of Oblivious and Non-Oblivious Regression Trees

Learning-to-Rank models based on additive ensembles of regression trees have proven very effective for scoring query results returned by large-scale Web search engines. Unfortunately, the computational cost of scoring thousands of candidate documents by traversing large ensembles of trees is high. Thus, several works have investigated solutions aimed at improving the efficiency of document scoring by exploiting advanced features of modern CPUs and memory hierarchies. In this article, we present QuickScorer, a new algorithm that adopts a novel cache-efficient representation of a given tree ensemble, performs an interleaved traversal by means of fast bitwise operations, and supports ensembles of oblivious trees. An extensive and detailed experimental assessment is conducted on two standard Learning-to-Rank datasets and on a new, very large dataset that we make publicly available for meaningful efficiency tests. The experiments show unprecedented speedups over the best state-of-the-art baselines, ranging from 1.9× to 6.6×. The analysis of low-level profiling traces shows that QuickScorer's efficiency is due to its cache-aware approach, in terms of both data layout and access patterns, and to a control flow that entails very low branch mis-prediction rates.
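The core idea described above, replacing per-tree root-to-leaf traversal with bitwise operations over precomputed leaf bitvectors, can be illustrated with a minimal sketch. This is an assumption-laden toy rendition, not the paper's actual implementation: the data layout, names, and the per-tree (rather than feature-interleaved, ensemble-wide) loop order are simplified for clarity. Each internal node carries a mask with a 0 for every leaf that becomes unreachable when the node's test evaluates to false; ANDing the masks of all false nodes leaves the exit leaf as the leftmost surviving bit.

```python
# Toy sketch of bitvector-based tree-ensemble scoring in the spirit of
# QuickScorer. Data layout and names are hypothetical; the real algorithm
# interleaves the traversal across all trees, sorted by feature.

def score(trees, x):
    """Score feature vector `x` with an ensemble of trees.

    Each tree is (nodes, leaves):
      nodes  -- list of (feature_id, threshold, mask); `mask` has a 0 bit
                for every leaf unreachable when the test x[fid] <= thr
                is FALSE (bit i corresponds to the i-th leftmost leaf)
      leaves -- list of leaf output values, left to right
    """
    total = 0.0
    for nodes, leaves in trees:
        v = (1 << len(leaves)) - 1          # all leaves start as candidates
        for fid, thr, mask in nodes:
            if x[fid] > thr:                # node test is false
                v &= mask                   # prune leaves it rules out
        exit_leaf = (v & -v).bit_length() - 1   # leftmost surviving leaf
        total += leaves[exit_leaf]
    return total
```

For example, a depth-2 tree with root test `x[0] <= 0.5`, left child `x[1] <= 0.5`, right child `x[1] <= 0.7`, and leaf values `[1.0, 2.0, 3.0, 4.0]` would use masks `0b1100`, `0b1110`, and `0b1011` respectively; scoring then reduces to a handful of ANDs and one find-first-set, with no data-dependent branches on the tree structure.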
