Challenges and Opportunities of Building Fast GBDT Systems

In recent years, Gradient Boosting Decision Trees (GBDTs) have been widely used in applications such as online advertising and spam filtering. However, GBDT training is often a key performance bottleneck in such data science pipelines, especially when training a large number of deep trees on large data sets. Many parallel and distributed GBDT systems have therefore been developed to accelerate training. In this survey, we review recent GBDT systems with respect to acceleration on emerging hardware as well as cluster computing, and compare the advantages and disadvantages of existing implementations. Finally, we present research opportunities and challenges in designing fast next-generation GBDT systems.
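To make the idea of hardware-accelerated GBDT training concrete, the sketch below shows how a typical GBDT library can offload split finding to a GPU. It is a minimal illustration using XGBoost's Python API with synthetic data; the data set, parameter values, and round count are illustrative assumptions, not configurations taken from the survey.

```python
# Minimal sketch (illustrative only) of GPU-accelerated GBDT training
# using XGBoost's Python API; parameter names follow XGBoost >= 2.0.
import numpy as np
import xgboost as xgb

# Synthetic regression data standing in for a large real-world data set.
rng = np.random.default_rng(0)
X = rng.standard_normal((100_000, 50))
y = X[:, 0] * 2.0 + rng.standard_normal(100_000) * 0.1

dtrain = xgb.DMatrix(X, label=y)

params = {
    "objective": "reg:squarederror",
    "max_depth": 8,         # deep trees make histogram construction expensive
    "tree_method": "hist",  # histogram-based split finding
    "device": "cuda",       # run training on the GPU (XGBoost >= 2.0)
}

# Training many boosting rounds is where parallel and GPU systems pay off.
booster = xgb.train(params, dtrain, num_boost_round=200)
```

Changing `device` back to `cpu` (or distributing the same workload over a cluster) is the kind of design trade-off the surveyed systems explore.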
