Paper Information - Random Local SVMs for Classifying Large Datasets

We propose a new parallel ensemble learning algorithm of random local support vector machines, called krSVM, for the effective non-linear classification of large datasets. The random local SVM in the krSVM learning strategy uses the k-means algorithm to partition the data into k clusters, after which it constructs a non-linear SVM in each cluster to classify the data locally, in parallel on multi-core computers. The krSVM algorithm is faster than the standard SVM in the non-linear classification of large datasets while maintaining classification correctness. Numerical test results on 4 datasets from the UCI repository and 3 benchmarks of handwritten letter recognition show that the proposed algorithm is efficient compared to the standard SVM.
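The local-model step described in the abstract (partition with k-means, then fit one non-linear SVM per cluster and route each query to the model of its nearest centroid) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`kmeans`, `train_rbf_svm`, `fit_local_svms`) and all parameter defaults are hypothetical, the SVM solver is a simplified bias-free dual coordinate ascent with an RBF kernel swapped in for a library solver, labels are assumed to be binary in {-1, +1}, and the random ensemble layer of krSVM is left out because the abstract does not specify it.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, p):
    """Index of the centroid closest to point p."""
    return min(range(len(centers)), key=lambda j: sq_dist(centers[j], p))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a list of k centroids."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        # An empty group keeps its previous centroid.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def train_rbf_svm(X, y, C=10.0, gamma=1.0, epochs=50):
    """Bias-free RBF-kernel SVM via dual coordinate ascent (assumed solver)."""
    if len(set(y)) == 1:  # single-class cluster: constant prediction
        label = y[0]
        return lambda x: label
    n = len(X)
    K = [[math.exp(-gamma * sq_dist(X[i], X[j])) for j in range(n)]
         for i in range(n)]
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            f_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            # K[i][i] == 1 for the RBF kernel, so the exact coordinate
            # step is the gradient 1 - y_i * f_i, clipped to [0, C].
            alpha[i] = min(C, max(0.0, alpha[i] + 1.0 - y[i] * f_i))

    def predict(x):
        s = sum(a * yi * math.exp(-gamma * sq_dist(xi, x))
                for a, yi, xi in zip(alpha, y, X) if a > 0)
        return 1 if s >= 0 else -1

    return predict


def fit_local_svms(X, y, k=2, seed=0):
    """Cluster the data, then train one local SVM per non-empty cluster."""
    centers = kmeans(X, k, seed=seed)
    parts = [([], []) for _ in centers]
    for xi, yi in zip(X, y):
        j = nearest(centers, xi)
        parts[j][0].append(xi)
        parts[j][1].append(yi)
    kept = [(c, p) for c, p in zip(centers, parts) if p[0]]
    centers = [c for c, _ in kept]
    # Clusters are independent, so local models can be trained concurrently.
    # Threads keep the sketch short; in CPython a process pool would be
    # needed to actually use several cores for this CPU-bound work.
    with ThreadPoolExecutor() as pool:
        models = list(pool.map(lambda p: train_rbf_svm(p[0], p[1]),
                               [p for _, p in kept]))

    def predict(x):
        return models[nearest(centers, x)](x)

    return predict
```

Calling `fit_local_svms(X, y, k=2)` returns a `predict` function for single points. Each local problem only builds a kernel matrix over its own cluster, which is where the cost saving over a single global SVM would come from in this sketch.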

François Poulet | Thanh-Nghi Do
