Engineering multilevel support vector machines

The computational complexity of training nonlinear support vector machines (SVMs) is prohibitive on large-scale data. The problem becomes especially acute when the data presents additional difficulties such as highly imbalanced class sizes. Typically, nonlinear kernels produce significantly higher classification quality than linear kernels, but they introduce extra kernel and model parameters that require computationally expensive fitting. This improves the quality but also dramatically increases the running time. We introduce a generalized fast multilevel framework for regular and weighted SVM and discuss several versions of its algorithmic components that lead to a good trade-off between quality and time. Our framework is implemented using PETSc, which allows easy integration with scientific computing tasks. The experimental results demonstrate a significant speedup compared to state-of-the-art nonlinear SVM libraries. Reproducibility: our source code, documentation and parameters are available at https:// this http URL.
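For orientation, the weighted SVM mentioned above is commonly written as the soft-margin problem below. This is a sketch in standard textbook notation, not necessarily the notation of the paper: the symbols $w$, $b$, $\xi_i$, $\phi$, $C^{+}$, $C^{-}$ and $\gamma$ are assumptions introduced here for illustration. The separate penalties $C^{+}$ and $C^{-}$ are what address imbalanced class sizes, and together with the kernel parameter $\gamma$ of a Gaussian kernel they are typical examples of the extra kernel and model parameters that have to be fitted.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad
  & \tfrac{1}{2}\,\lVert w \rVert^{2}
    + C^{+} \sum_{i \,:\, y_i = +1} \xi_i
    + C^{-} \sum_{i \,:\, y_i = -1} \xi_i \\
\text{s.t.}\quad
  & y_i \bigl( w^{\top} \phi(x_i) + b \bigr) \ge 1 - \xi_i,
    \qquad \xi_i \ge 0, \qquad i = 1, \dots, n, \\[4pt]
  & K(x_i, x_j) = \phi(x_i)^{\top} \phi(x_j)
    = \exp\!\bigl( -\gamma \,\lVert x_i - x_j \rVert^{2} \bigr).
\end{aligned}
```

Setting $C^{+} = C^{-}$ recovers the regular (unweighted) soft-margin SVM, so the regular case is a special case of the weighted one.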
