HardGBM: A Framework for Accurate and Hardware-Efficient Gradient Boosting Machines

Gradient boosting machines (GBMs) are a powerful and widely used class of ensemble machine learning methods, of which XGBoost (XGB) is the best-known example. However, the cost of running large GBMs in hardware can become prohibitive under stringent resource constraints. Ensemble reduction for boosting ensembles is intrinsically hard because member models are constructed sequentially: the training targets of later members depend on the outputs of the earlier ones. In this work, a GBM reduction framework is proposed for the first time to tackle this problem; the framework also supports, for the first time, automatic hardware implementation of regression tree ensembles. Experiments on 24 datasets from various applications demonstrate that our method reduces overall area utilization by 81.60% (80.64%) and power consumption by 21.15% (19.06%) while matching or exceeding the accuracy of the original XGB (LightGBM) ensembles. In comparative experiments, to attain approximately the same accuracy as our framework or XGB, deep learning based solutions require $52.7\times$ the footprint, $6.0\times$ the power consumption, and $1.4\times$ the training time. Equipped with tunable parameters, the framework can explore a Pareto front that trades off hardware resource limitations, accuracy and stability, and computation (training) efficiency.
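
To make the sequential coupling concrete, the following minimal Python sketch fits a toy gradient-boosted regression ensemble with a squared-error objective and scikit-learn's DecisionTreeRegressor. It is illustrative only (not the HardGBM reduction procedure); it shows that each tree is trained on the residuals left by all preceding trees, which is why naively removing members from a boosting ensemble is harder than pruning an independently trained bagging ensemble.

# Minimal sketch (not the paper's method) of why boosting members are coupled:
# each regression tree is fit to the residuals left by all previous trees, so
# dropping an earlier tree invalidates the targets the later trees were fit to.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

learning_rate = 0.1
trees = []
prediction = np.zeros_like(y)           # F_0(x) = 0 for simplicity

for _ in range(50):
    residual = y - prediction            # negative gradient of squared error
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
    trees.append(tree)
    prediction += learning_rate * tree.predict(X)

# Removing, say, the 10th tree changes every residual target that the trees
# after it were trained on, so any reduction scheme must account for this
# dependency rather than scoring members independently.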
