Convex and concave hulls for classification with support vector machine

Training a support vector machine (SVM) has a time complexity of O(n^3), where n is the number of training samples, so standard SVM algorithms are not suitable for classifying large data sets. A convex hull can simplify SVM training; however, classification accuracy drops when the data contain inseparable points. This paper introduces a novel method for SVM classification, called the convex-concave hull. After grid pre-processing, the convex hull and the concave (non-convex) hull are found with the Jarvis march method. The vertices of the convex-concave hull are then used for SVM training. The proposed convex-concave hull SVM classifier has distinct advantages in handling large data sets with higher accuracy. Experimental results demonstrate that our approach achieves good classification accuracy while training significantly faster than other training methods.
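
As a rough illustration of the hull step, the sketch below implements a plain Jarvis march (gift wrapping) for 2-D points and uses it to shrink each class to its convex hull vertices. It is a minimal sketch under stated assumptions, not the paper's method: the grid pre-processing and the concave-hull refinement are not reproduced, and the names `jarvis_march` and `reduce_training_set` are hypothetical helpers introduced only for this example.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive if b lies left of o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between a and b."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def jarvis_march(points: Sequence[Point]) -> List[Point]:
    """Convex hull vertices in counter-clockwise order (gift wrapping)."""
    pts = sorted(set(points))  # remove duplicates; pts[0] is the leftmost point
    if len(pts) < 3:
        return pts
    start = pts[0]  # the leftmost (lowest on ties) point is always on the hull
    hull: List[Point] = []
    current = start
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            # Take p if it lies to the right of current->candidate, or if it
            # is collinear and farther away (this skips points inside an edge).
            if turn < 0 or (turn == 0 and dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        if current == start:  # wrapped around to the first vertex
            break
    return hull


def reduce_training_set(
    points: Sequence[Point], labels: Sequence[Hashable]
) -> Tuple[List[Point], List[Hashable]]:
    """Keep only the convex hull vertices of each class (hypothetical helper)."""
    by_class: Dict[Hashable, List[Point]] = {}
    for p, y in zip(points, labels):
        by_class.setdefault(y, []).append(p)
    reduced_points: List[Point] = []
    reduced_labels: List[Hashable] = []
    for y, class_points in by_class.items():
        for vertex in jarvis_march(class_points):
            reduced_points.append(vertex)
            reduced_labels.append(y)
    return reduced_points, reduced_labels
```

The reduced points and labels can then be passed to any SVM trainer in place of the full data set. Jarvis march runs in O(nh) time for n points and h hull vertices, so the reduction pays off most when each class has few hull vertices relative to its size.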
