SAKMA: Specialized FPGA-Based Accelerator Architecture for Data-Intensive K-Means Algorithms

In the era of big data explosion, the K-means clustering algorithm poses significant challenges in processing speed due to huge data volumes and high dimensionality. To address this problem, we design an FPGA-based hardware implementation of K-means, named SAKMA, which accelerates the whole algorithm in hardware and can be easily configured via parameters. Moreover, the accelerator places no limit on the data size and mitigates, to a certain extent, the problem of frequent off-chip memory access. Taking hardware resources and power consumption into account, the SAKMA architecture adopts novel methods to accelerate the algorithm, including pipelining, a tiling technique, duplication parallelism, and hardware adder tree structures. To evaluate the performance of the accelerator, we have constructed a real hardware prototype on a Xilinx ZedBoard xc7z020clg484-1 FPGA. Experimental results demonstrate that the SAKMA architecture achieves a speedup of 20.5× at an affordable hardware cost.
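As a rough illustration of two of the techniques named above, the following Python sketch models tiling and adder tree reduction in software. It is not the SAKMA design itself: the function names (`adder_tree_sum`, `kmeans_tiled`), the `tile_size` parameter, and the choice of squared Euclidean distance are assumptions made for this example, and pipelining and duplication parallelism are hardware properties that a sequential model like this cannot show.

```python
from typing import List, Sequence


def adder_tree_sum(values: Sequence[float]) -> float:
    """Sum values pairwise, level by level, the way a balanced adder tree would."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an odd leftover element passes through to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


def squared_distance(point: Sequence[float], centroid: Sequence[float]) -> float:
    """Squared Euclidean distance; the per-dimension terms are reduced by the tree."""
    return adder_tree_sum([(a - b) ** 2 for a, b in zip(point, centroid)])


def nearest_centroid(point: Sequence[float], centroids: Sequence[Sequence[float]]) -> int:
    """Index of the centroid closest to the point."""
    dists = [squared_distance(point, c) for c in centroids]
    return min(range(len(centroids)), key=dists.__getitem__)


def kmeans_tiled(
    points: Sequence[Sequence[float]],
    initial_centroids: Sequence[Sequence[float]],
    tile_size: int,
    iterations: int,
) -> List[List[float]]:
    """Run K-means while touching only one tile of points at a time.

    Only the per-cluster sums and counts persist across tiles, so the working
    set is bounded by tile_size no matter how many points there are (an
    assumption about how tiling removes the data-size limit).
    """
    centroids = [list(c) for c in initial_centroids]
    k, dim = len(centroids), len(centroids[0])
    for _ in range(iterations):
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for start in range(0, len(points), tile_size):
            tile = points[start:start + tile_size]  # stands in for one on-chip buffer load
            for p in tile:
                j = nearest_centroid(p, centroids)
                counts[j] += 1
                for d in range(dim):
                    sums[j][d] += p[d]
        for j in range(k):
            if counts[j]:  # keep the old centroid if a cluster received no points
                centroids[j] = [s / counts[j] for s in sums[j]]
    return centroids
```

For example, `kmeans_tiled([(0, 0), (0, 2), (10, 10), (10, 12)], [(0, 0), (10, 10)], tile_size=3, iterations=2)` processes the four points in two tiles and converges on one centroid per pair of nearby points.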
