FPGA implementations of data mining algorithms

In recent decades there has been exponential growth in the quantity of collected data. Various data mining procedures have been developed to extract information from such large amounts of data. Handling ever-increasing amounts of data generates a growing demand for computing power. There are several ways of meeting this demand, such as multiprocessor systems and graphics processing units (GPUs). Another is the use of field-programmable gate array (FPGA) devices as hardware accelerators. This paper surveys the application of FPGAs as hardware accelerators for data mining. Three data mining algorithms were selected for the survey: classification and regression trees, support vector machines, and k-means clustering. A literature review and an analysis of FPGA implementations were conducted for the three selected algorithms. From this analysis, conclusions were drawn on methods of implementation, common problems and limitations, and means of overcoming them.
