Hierarchical Ensemble Reduction and Learning for Resource-constrained Computing

Generic tree ensembles, such as Random Forest (RF), rely on a large number of individual models to attain desirable performance. The cost of maintaining a large ensemble can become prohibitive in applications where computing resources are tightly constrained. In this work, a hierarchical ensemble reduction and learning framework is proposed. Experiments show that our method consistently outperforms RF in terms of both accuracy and retained ensemble size; in other words, ensemble reduction is achieved with an enhancement in accuracy rather than a degradation. The method executes efficiently, offering up to a >590× reduction in runtime compared with a recent ensemble reduction approach. We also develop Boolean logic encoding techniques to tackle multiclass problems directly. Moreover, our framework bridges the gap between software-based ensemble methods and hardware computing in the IoT era: we develop a novel conversion paradigm that supports the automatic deployment of more than 500 trees on a chip. Compared with RF, the proposed method reduces power consumption by >21.5% and overall area utilization by >62%. The hierarchical approach provides rich opportunities to balance computation (training and response time), hardware resources (memory and energy), and accuracy.
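
As a rough illustration of the ensemble-reduction idea described above, and not the paper's exact algorithm, the Python sketch below trains a Random Forest, characterizes each tree by its predictions on the training split, clusters those prediction signatures with k-means, and keeps one representative tree per cluster; the dataset, cluster count, and all variable names are illustrative assumptions.

```python
# Hedged sketch: reduce a tree ensemble by clustering trees that behave alike
# and keeping one representative per cluster (not the paper's actual method).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_tr, y_tr)

# Behaviour signature of each tree: its predictions on the training split.
signatures = np.array([tree.predict(X_tr) for tree in rf.estimators_])

# Cluster trees by signature; keep the tree closest to each cluster centroid.
k = 32  # retained ensemble size (illustrative value)
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(signatures)
kept = []
for c in range(k):
    members = np.where(km.labels_ == c)[0]
    dists = np.linalg.norm(signatures[members] - km.cluster_centers_[c], axis=1)
    kept.append(members[np.argmin(dists)])

# Majority vote over the reduced ensemble.
votes = np.array([rf.estimators_[i].predict(X_te) for i in kept]).astype(int)
reduced_pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

print("full RF accuracy:      ", accuracy_score(y_te, rf.predict(X_te)))
print("reduced ensemble (k=32):", accuracy_score(y_te, reduced_pred))
```

In this simplified view, the trade-off between retained ensemble size, accuracy, and hardware cost is controlled by the single parameter k; the hierarchical framework in the paper manages this balance more systematically.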
