Accelerating Random Forest training process using FPGA

Random Forest (RF) is one of the state-of-the-art supervised learning methods in Machine Learning and inherently consists of two steps: the training step and the evaluation step. In applications where the system needs to be updated periodically, the training step becomes the bottleneck of the system, imposing hard constraints on its adaptability to a changing environment. In this work, a novel FPGA architecture for accelerating the RF training step is presented, exploiting key features of the device. By combining fine-grain data-flow processing at the low level with the parallelism inherent in the algorithm at the high level, significant acceleration factors are achieved. Key to the above gains are a novel FPGA FIFO-based merge-sorter module, a core component of the architecture that exhibits high memory-utilisation efficiency, and a batch training strategy that enables full exploitation of the high bandwidth offered by the on-chip memory featured on FPGA devices. The proposed system achieves speed-up factors of up to 230x over a 3GHz Intel Core i5 processor when an Altera Stratix IV device is utilised on classification problems drawn from the VOC2007 dataset.
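The sorting stage matters because selecting a split threshold for a continuous attribute in decision-tree training typically requires the attribute values in sorted order. The building block of a FIFO-based merge sorter is a two-way merge unit: two already-sorted input streams held in FIFOs are combined into one longer sorted stream by repeatedly comparing the two head elements and forwarding the smaller one; a merge tree is formed by cascading such units. The following is a minimal software sketch of that two-way merge only, using std::queue as a stand-in for hardware FIFOs; the function name fifoMerge and the data layout are illustrative assumptions, not the paper's actual RTL or HLS code.

    // Software model of a two-way FIFO merge unit: consume the smaller of the
    // two FIFO heads until one stream is exhausted, then drain the other.
    #include <cstdio>
    #include <queue>
    #include <vector>

    // Merge two sorted FIFOs into a single sorted output sequence.
    std::vector<int> fifoMerge(std::queue<int> a, std::queue<int> b) {
        std::vector<int> out;
        while (!a.empty() && !b.empty()) {
            if (a.front() <= b.front()) { out.push_back(a.front()); a.pop(); }
            else                        { out.push_back(b.front()); b.pop(); }
        }
        // Drain whichever FIFO still holds elements.
        while (!a.empty()) { out.push_back(a.front()); a.pop(); }
        while (!b.empty()) { out.push_back(b.front()); b.pop(); }
        return out;
    }

    int main() {
        std::queue<int> a, b;
        for (int v : {1, 4, 7, 9})  a.push(v);   // first sorted run
        for (int v : {2, 3, 8, 10}) b.push(v);   // second sorted run
        for (int v : fifoMerge(a, b)) std::printf("%d ", v);
        std::printf("\n");  // prints: 1 2 3 4 7 8 9 10
        return 0;
    }

In hardware, the advantage of this structure is that each merge unit needs only small FIFO buffers and a single comparator, so wide merge trees can be instantiated in parallel within the on-chip memory budget.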
