Identifying uncertainty regions in Support Vector Machines using geometric margin and convex hulls

Like most classification techniques, existing support vector machine (SVM) approaches struggle to classify their input correctly when the data points are either very close to the decision boundary or very dissimilar from the training data set. In both situations, most classifiers, including SVMs, will still give a prediction by assigning the test point to one of the classes. However, when a test instance is very close to the decision boundary, the side of the boundary on which the instance lies, and hence the predicted class, will often depend more on the choice of tuning or training parameters than on clear differences in features. Furthermore, if a test instance is substantially different from all instances used during training, classical SVM classifiers will still assign it to a class, although there is little evidence to support such an assignment. In both cases, it is very useful for a classifier to be able to assess its ability to classify a given instance by identifying those regions of the feature space in which the class assignments are less certain. In this paper, we propose two novel approaches based on: i) a geometric uncertainty margin and ii) the convex hulls of the training points in the feature space. Our proposed techniques improve upon existing SVM-based approaches by adding the ability to identify "uncertainty" areas where the assignment of a test instance to a class cannot be guaranteed. We illustrate both the problems and our novel techniques on the Iris data set from the UCI machine learning repository.
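The two ideas can be illustrated with a minimal sketch. It is not the method of the paper, only one plausible reading of it under strong simplifying assumptions: the separating hyperplane (`w`, `b`) is taken as given rather than trained, the data are toy two-dimensional points rather than the Iris data, the convex hulls are computed in the input space rather than in a kernel-induced feature space, and the threshold `margin` and all function names are hypothetical.

```python
import math

def geometric_distance(w, b, x):
    """Signed Euclidean distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns 2-D hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, x):
    """True if x lies inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], x) >= 0 for i in range(n))

def classify(w, b, hulls, x, margin=0.5):
    """Return a class label, or an 'uncertain' label with the reason."""
    d = geometric_distance(w, b, x)
    if abs(d) < margin:
        # Assumed rule: too close to the boundary to trust the sign.
        return "uncertain: near boundary"
    if not any(inside_hull(h, x) for h in hulls):
        # Assumed rule: unlike anything seen during training.
        return "uncertain: outside training hulls"
    return "positive" if d > 0 else "negative"

# Toy training data (illustrative only, not the Iris data set).
negatives = [(0, 0), (1, 0), (0, 1), (1, 1)]
positives = [(3, 3), (4, 3), (3, 4), (4, 4)]
hulls = [convex_hull(negatives), convex_hull(positives)]

# Assumed separating hyperplane x + y = 4 (not obtained by SVM training).
w, b = (1.0, 1.0), -4.0

for point in [(0.5, 0.5), (3.5, 3.5), (2.0, 2.1), (6.0, 6.0)]:
    print(point, classify(w, b, hulls, point))
```

In this sketch the margin test runs before the hull test, so a point that is both close to the boundary and outside every hull is reported only as near the boundary; the ordering is a design choice of the example, not something the abstract specifies.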
