Distributed Training of Support Vector Machine on a Multiple-FPGA System

The Support Vector Machine (SVM) is a supervised machine learning model for classification tasks. Training an SVM on a large number of data samples is challenging because of its high computational cost and memory requirements. Training is therefore usually performed on a high-performance server that runs a sequential training algorithm on centralized data. However, as workloads grow to massive scale, it will become impossible to store all the data centrally and expect such sequential training algorithms to scale on traditional processors. Moreover, with the growing demand for real-time machine learning in edge analytics, it is imperative to devise an efficient training framework that requires relatively cheap computation and limited memory. We therefore propose and implement a first-of-its-kind multiple-FPGA system: a distributed computing framework comprising up to eight FPGA units on Amazon F1 instances that fully parallelizes, accelerates, and scales SVM training on decentralized data with negligible communication overhead. Each FPGA unit contains a pipelined SVM training IP logic core that operates at 125 MHz with a power dissipation of 39 Watts and accelerates the computations allocated to that unit in the overall training process. We evaluate and compare the performance of the proposed system on five real SVM benchmarks.
