FPGA based accelerator for parallel DBSCAN algorithm

Data mining plays a vital role in many application fields. One important task in data mining is clustering, the process of grouping data items with high similarity. Density-based clustering is an effective method that can find arbitrarily shaped clusters in feature space, and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a fundamental algorithm of this kind. With the tremendous growth in data sizes, clustering algorithms can take several hours or more to run. In recent years, FPGAs have delivered notable acceleration in data mining applications. In this paper, we study the parallel DBSCAN algorithm and map it to an FPGA using an architecture that exploits both task-level and data-level parallelism. Experimental results show that this accelerator achieves up to 86x speedup over a software implementation on a general-purpose processor and 2.9x over a software implementation on a graphics processor.
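
For readers unfamiliar with DBSCAN, the following is a minimal software sketch of the standard sequential algorithm in Python. It is not the FPGA architecture described in this paper. The names `region_query`, `dbscan`, `eps`, and `min_pts` are illustrative choices, and the brute-force neighborhood search is an assumption made for brevity. The distance computations inside `region_query` are independent of one another, which is one plausible place for data-level parallelism; the abstract does not say how the accelerator actually partitions the work.

```python
from math import dist

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet examined


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    # Each distance test is independent of the others, so this loop
    # is a natural candidate for parallel evaluation (assumption).
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        # points[i] is a core point: start a new cluster and expand it.
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point
        cluster += 1
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(dbscan(pts, eps=1.5, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy input the two groups of three nearby points form clusters 0 and 1, and the isolated point at (50, 50) is labeled as noise.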
