Parallel Dual Coordinate Descent Method for Large-scale Linear Classification in Multi-core Environments

The dual coordinate descent method is one of the most effective approaches for large-scale linear classification. However, its sequential design makes parallelization difficult. In this work, we target parallelization in a multi-core environment. After pointing out the difficulties faced by some existing approaches, we propose a new framework to parallelize the dual coordinate descent method. The key idea is to make the bulk of the computation, namely the gradient calculation, parallelizable. The proposed framework is shown to be theoretically sound, and experiments demonstrate that it is robust and efficient in a multi-core environment.
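To make the stated idea concrete, below is a minimal illustrative sketch, not the authors' implementation, of how the dominant cost of dual coordinate descent, the gradient calculation over a block of dual variables, can run in parallel with OpenMP while the coordinate updates themselves stay sequential. The problem form (dual of an L1-loss linear SVM), the block sizes, and the greedy selection rule are assumptions made only for illustration.

// Hedged sketch: parallel gradient evaluation for dual coordinate descent on the
// dual of an L2-regularized L1-loss linear SVM (variables alpha_i, 0 <= alpha_i <= C).
// The large candidate block B_bar has its gradients computed in parallel; only a
// small subset B of most-violating coordinates is then updated sequentially, so the
// bulk of the inner products is parallel work. All names and constants are illustrative.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>
#include <omp.h>

int main() {
    // Toy dense data: n instances, d features, labels y_i in {-1, +1}.
    const int n = 1000, d = 50;
    const double C = 1.0;
    std::mt19937 gen(0);
    std::normal_distribution<double> dist(0.0, 1.0);
    std::vector<std::vector<double>> X(n, std::vector<double>(d));
    std::vector<double> y(n);
    for (int i = 0; i < n; ++i) {
        double s = 0.0;
        for (int j = 0; j < d; ++j) { X[i][j] = dist(gen); s += X[i][j]; }
        y[i] = (s > 0) ? 1.0 : -1.0;
    }

    std::vector<double> alpha(n, 0.0), w(d, 0.0);
    std::vector<double> Qii(n);              // Q_ii = x_i^T x_i (since y_i^2 = 1)
    for (int i = 0; i < n; ++i)
        Qii[i] = std::inner_product(X[i].begin(), X[i].end(), X[i].begin(), 0.0);

    auto grad = [&](int i) {                 // G_i = y_i w^T x_i - 1
        return y[i] * std::inner_product(w.begin(), w.end(), X[i].begin(), 0.0) - 1.0;
    };

    const int block = 256;                   // |B_bar|: gradients computed in parallel
    const int inner = 16;                    // |B|: coordinates actually updated per step
    std::vector<int> idx(n);
    std::iota(idx.begin(), idx.end(), 0);

    for (int iter = 0; iter < 200; ++iter) {
        std::shuffle(idx.begin(), idx.end(), gen);
        std::vector<int> Bbar(idx.begin(), idx.begin() + block);
        std::vector<double> viol(block);

        // Parallel step: gradient (one inner product per coordinate) for the whole block.
        #pragma omp parallel for schedule(static)
        for (int k = 0; k < block; ++k) {
            int i = Bbar[k];
            double G = grad(i);
            double PG = G;                   // projected gradient at the bound constraints
            if (alpha[i] <= 0.0)      PG = std::min(G, 0.0);
            else if (alpha[i] >= C)   PG = std::max(G, 0.0);
            viol[k] = std::fabs(PG);
        }

        // Pick the most-violating coordinates; |B| << |B_bar|, so most inner
        // products were already done in the parallel step above.
        std::vector<int> order(block);
        std::iota(order.begin(), order.end(), 0);
        std::partial_sort(order.begin(), order.begin() + inner, order.end(),
                          [&](int a, int b) { return viol[a] > viol[b]; });

        // Sequential step: exact single-coordinate updates, keeping w consistent.
        for (int k = 0; k < inner; ++k) {
            int i = Bbar[order[k]];
            double G = grad(i);              // refreshed so the update uses a current gradient
            double a_new = std::min(std::max(alpha[i] - G / Qii[i], 0.0), C);
            double delta = a_new - alpha[i];
            if (delta != 0.0) {
                alpha[i] = a_new;
                for (int j = 0; j < d; ++j) w[j] += delta * y[i] * X[i][j];
            }
        }
    }

    // Dual objective for the L1-loss case: 0.5 * w^T w - sum_i alpha_i.
    double obj = 0.5 * std::inner_product(w.begin(), w.end(), w.begin(), 0.0)
                 - std::accumulate(alpha.begin(), alpha.end(), 0.0);
    std::printf("dual objective after training: %f\n", obj);
    return 0;
}

Under these assumptions, the parallel pass serves mainly to select a small working set, so the many inner products over the candidate block run concurrently while the few sequential updates keep w exact; compile with, e.g., g++ -O2 -fopenmp sketch.cpp.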
