Fβ support vector machines

We introduce in this paper Fβ SVMs, a new parametrization of support vector machines. It allows an SVM to be optimized in terms of Fβ, a classical information retrieval criterion, instead of the usual classification rate. Experiments illustrate the advantages of this approach over the traditional 2-norm soft-margin SVM when precision and recall are of unequal importance. An automatic model selection procedure based on the generalization Fβ score is also introduced. It relies on the results of Chapelle, Vapnik et al. (4) on the use of gradient-based techniques in SVM model selection. The derivatives of an Fβ loss function with respect to the regularization constant C and the width σ of a Gaussian kernel are formally defined, and the model is then selected by performing a gradient descent of the Fβ loss function over the set of hyperparameters. Experiments on artificial and real-life data show the benefits of this method when the Fβ score is the criterion of interest.

I. INTRODUCTION

Support Vector Machines (SVMs), introduced by Vapnik (18), have been widely used in the field of pattern recognition for the last decade. The popularity of the method relies on its strong theoretical foundations as well as on its practical results. The performance of classifiers is usually assessed by means of the classification error rate or by Information Retrieval (IR) measures such as precision, recall, Fβ, the break-even point and ROC curves. Unfortunately, there is no direct connection between these IR criteria and the SVM hyperparameters: the regularization constant C and the kernel parameters. In this paper, we propose a novel method allowing the user to specify requirements in terms of the Fβ criterion.

First, the Fβ measure is reviewed as a user specification criterion in section II. A new SVM parametrization dealing with the β parameter is introduced in section III. A procedure for automatic model selection according to Fβ is then proposed in section IV; this procedure is a gradient-based technique derived from the results of Chapelle, Vapnik et al. (4). Finally, experiments on artificial and real-life data are presented in section V.

The two previous measures, precision and recall, can be combined into a single Fβ measure in which the parameter β specifies the relative importance of recall with respect to precision. Setting β = 0 amounts to considering precision only, whereas taking β = ∞ takes only recall into account; precision and recall are of equal importance when using the F1 measure. The contingency matrix and estimations of precision, recall and Fβ are given hereafter.

              Target: +1            Target: -1
   +1         True Pos. (#TP)       False Pos. (#FP)
   -1         False Neg. (#FN)      True Neg. (#TN)
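The precision, recall and Fβ estimators announced above are not reproduced in this excerpt. As a minimal sketch, the standard definitions are assumed here, since they reproduce the limiting behaviours described in the text (β = 0 recovers precision, β = ∞ recovers recall, β = 1 weights both equally):

\[
  \mathrm{precision} = \frac{\#TP}{\#TP + \#FP}, \qquad
  \mathrm{recall} = \frac{\#TP}{\#TP + \#FN},
\]
\[
  F_\beta = \frac{(1 + \beta^{2})\,\mathrm{precision}\cdot\mathrm{recall}}{\beta^{2}\,\mathrm{precision} + \mathrm{recall}}.
\]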

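To make the model-selection idea from the abstract concrete, the following is a minimal sketch and not the paper's method: instead of the analytic derivatives of the Fβ loss with respect to C and σ, it simply scans a log-scale grid of (C, σ) pairs and keeps the one with the best cross-validated Fβ score. The scikit-learn estimator and the imbalanced toy data set are assumptions made purely for illustration.

# Hypothetical illustration only: tune (C, sigma) of an RBF-kernel SVM against
# a cross-validated F_beta objective.  The paper performs a gradient descent on
# an F_beta loss; here a simple log-scale grid search stands in for it.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import fbeta_score
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC

beta = 2.0                                   # recall twice as important as precision
X, y = make_classification(n_samples=400, weights=[0.8, 0.2], random_state=0)

best_params, best_score = None, -1.0
for C in np.logspace(-2, 3, 6):              # regularization constant C
    for sigma in np.logspace(-2, 2, 5):      # Gaussian kernel width sigma
        gamma = 1.0 / (2.0 * sigma ** 2)     # k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))
        y_hat = cross_val_predict(SVC(C=C, gamma=gamma), X, y, cv=5)
        score = fbeta_score(y, y_hat, beta=beta)
        if score > best_score:
            best_params, best_score = (C, sigma), score

print("best (C, sigma):", best_params, "cross-validated F_beta:", best_score)

In the paper itself, this outer loop is replaced by a gradient descent that uses the formally defined derivatives of the Fβ loss with respect to C and σ.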
REFERENCES

[1] Carl D. Meyer, et al. Matrix Analysis and Applied Linear Algebra, 2000.

[2] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[3] Vladimir Vapnik, et al. Statistical learning theory, 1998.

[4] Thorsten Joachims, et al. Making large-scale support vector machine learning practical, 1999.

[5] R. C. Williamson, et al. Generalization Bounds via Eigenvalues of the Gram matrix, 1999.

[6] Sandro Ridella, et al. Model selection in top quark tagging with a support vector classifier, 2004, 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541).

[7] Nello Cristianini, et al. Controlling the Sensitivity of Support Vector Machines, 1999.

[8] Alexander J. Smola, et al. Learning with kernels, 1998.

[9] Fabrizio Sebastiani, et al. Machine learning in automated text categorization, 2001, CSUR.

[10] S. Sathiya Keerthi, et al. A fast iterative nearest point algorithm for support vector machine classifier design, 2000, IEEE Trans. Neural Networks Learn. Syst.

[11] Corinna Cortes, et al. Support-Vector Networks, 1995, Machine Learning.

[12] Thorsten Joachims, et al. Estimating the Generalization Performance of an SVM Efficiently, 2000, ICML.

[13] R. Vanderbei. LOQO: an interior point code for quadratic programming, 1999.

[14] S. Sathiya Keerthi, et al. Evaluation of simple performance measures for tuning SVM hyperparameters, 2003, Neurocomputing.

[15] Roger Fletcher, et al. A Rapidly Convergent Descent Method for Minimization, 1963, Comput. J.

[16] Nello Cristianini, et al. An Introduction to Support Vector Machines, 2000.

[17] Dustin Boswell, et al. Introduction to Support Vector Machines, 2002.

[18] Stephen Kwek, et al. Applying Support Vector Machines to Imbalanced Datasets, 2004, ECML.

[19] Chih-Jen Lin, et al. Radius Margin Bounds for Support Vector Machines with the RBF Kernel, 2002, Neural Computation.

[20] V. Vapnik, et al. Bounds on Error Expectation for Support Vector Machines, 2000, Neural Computation.

[21] Catherine Blake, et al. UCI Repository of machine learning databases, 1998.

[22] Sayan Mukherjee, et al. Choosing Multiple Parameters for Support Vector Machines, 2002, Machine Learning.