GPU Acceleration of Interior Point Methods in Large Scale SVM Training

Training a large-scale support vector machine (SVM) requires solving a convex quadratic programming problem, which is computationally expensive. Interior point methods (IPMs) have been used successfully to solve this problem: they have polynomial time complexity, and the linear system that must be solved at each iteration keeps a constant, predictable structure. Their main drawback is the cost, in both computation and storage, when they are applied to real-life problems with millions of examples. This paper proposes an approach that significantly improves the performance of large-scale SVM training on GPU-equipped clusters. It exploits the parallelism of IPM using the Compute Unified Device Architecture (CUDA) on NVIDIA GTX480 GPUs. Several operations, such as Cholesky factorization (CF), dominate the running time, which motivates implementing them on the GPU to yield further performance gains. The proposed solution allows efficient training on large datasets such as cover types, rcv1, and url. On our 5-node cluster, using GPUs gives a speedup of about 3 over using only quad-core processors. The equivalent speedup of a single node over LibSVM is about 90 for the big dataset. These results demonstrate that GPUs can significantly improve the performance of large-scale SVM training on clusters.
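To illustrate why Cholesky factorization dominates the cost, the following minimal sketch solves one IPM-style linear system in plain Python. It is not the paper's CUDA implementation. It assumes the per-iteration matrix has the common form Q + diag(d), where Q is fixed and only the diagonal d changes between iterations; the names `newton_system`, `cholesky`, and `solve_spd` are hypothetical.

```python
import math


def newton_system(q, d):
    """Build Q + diag(d). Assumed form of the per-iteration IPM matrix:
    the structure of Q is fixed, only the diagonal d changes."""
    n = len(q)
    return [[q[i][j] + (d[i] if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def cholesky(a):
    """Return lower-triangular L with L * L^T == a.
    The matrix a must be symmetric positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = a[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        low[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            t = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = t / low[j][j]
    return low


def solve_spd(a, b):
    """Solve a x = b using the Cholesky factor of a."""
    n = len(a)
    low = cholesky(a)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    # Backward substitution: L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


# Example: one solve with a small, made-up Q and diagonal d.
q = [[4.0, 2.0], [2.0, 3.0]]
d = [0.0, 0.0]
step = solve_spd(newton_system(q, d), [2.0, 1.0])
```

The factorization costs O(n^3) operations for a dense n-by-n matrix, while the two triangular solves cost only O(n^2), which is why the factorization is the natural candidate for GPU offloading.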
