Selecting training sets for support vector machines: a review

Support vector machines (SVMs) are supervised classifiers that have been applied successfully in a wide range of real-life applications. However, they suffer from high time and memory training complexities, both of which depend on the training set size. This issue is especially challenging today, since the amount of data generated every second has become extremely large in many domains. This review provides an extensive survey of existing methods for selecting SVM training data from large datasets. We divide the state-of-the-art techniques into several categories, which help in understanding the underlying ideas behind these algorithms and may be useful in designing new methods to deal with this important problem. The review is complemented by a discussion of future research pathways that can make SVMs easier to exploit in practice.
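
As a rough illustration of why the training set size matters, the sketch below draws a per-class random sample of a training set and compares the memory a dense kernel matrix would need before and after selection. It is only a minimal sketch and not one of the surveyed methods as such: the function names `stratified_subset` and `kernel_matrix_bytes`, the 10% fraction, and the toy label vector are all assumptions made for this example, and a dense float64 kernel matrix is assumed as a simple upper bound (practical solvers usually avoid storing it in full).

```python
import random
from collections import defaultdict


def stratified_subset(labels, fraction, seed=0):
    """Return sorted indices of a per-class random sample (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    chosen = []
    for indices in by_class.values():
        # Keep at least one example of every class.
        k = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


def kernel_matrix_bytes(n, bytes_per_entry=8):
    """Memory for a dense n-by-n kernel matrix (assumed float64 entries)."""
    return n * n * bytes_per_entry


# Toy, imbalanced label vector: 9000 examples of class 0, 1000 of class 1.
labels = [0] * 9000 + [1] * 1000
subset = stratified_subset(labels, fraction=0.1)

full_bytes = kernel_matrix_bytes(len(labels))
subset_bytes = kernel_matrix_bytes(len(subset))
print(len(subset), full_bytes // subset_bytes)
```

Because the dense kernel matrix grows quadratically with the number of training vectors, keeping one tenth of the examples reduces this particular cost by a factor of one hundred. Random sampling ignores which vectors are likely to become support vectors, which is the gap that more informed selection strategies aim to close.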
