K-Fold Generalization Capability Assessment for

The problem of how to effectively implement k-fold cross-validation for Support Vector Machines is here considered. Despite the fact that this selection criterion is widely used, due to its reasonable requirements in terms of computational resources and its good ability in identifying a well-performing model, it is not clear how one should employ the committee of classifiers coming from the k folds for the task of on-line classification. Three methods are here described and tested, based respectively on averaging, random choice, and majority voting. Each of these methods is tested on a wide range of datasets for different fold settings.

I. INTRODUCTION

K-fold Cross-Validation (KCV) is one of the most adopted criteria for assessing the performance of a model and for selecting a hypothesis within a class. An advantage of this method, over the simple training-test data splitting, is the repeated use of the whole available data both for building a learning machine and for testing it, thus reducing the risk of (un)lucky splitting.

Despite the fact that KCV does not have a dedicated theory which guarantees ad-hoc bounds on the generalization error, and that the variance of the estimated true error is hard to assess (1), this method is currently widely used in many fields for different problem types, and it performs like the best available selection criteria while requiring a moderate computational overhead (6).
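The three strategies mentioned in the abstract for combining the committee of k fold classifiers can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each fold classifier produces a real-valued SVM decision output for a test point, with the sign giving the predicted label in {-1, +1}; the function names are hypothetical.

```python
import random
from collections import Counter

def average_decision(decision_values):
    """Averaging: average the k real-valued SVM outputs, then take the sign."""
    return 1 if sum(decision_values) / len(decision_values) >= 0 else -1

def random_choice(decision_values, rng=random):
    """Random choice: pick one of the k fold classifiers at random
    and use its prediction alone."""
    return 1 if rng.choice(decision_values) >= 0 else -1

def majority_vote(decision_values):
    """Majority voting: each fold classifier casts a vote with its
    predicted label; the most frequent label wins."""
    votes = Counter(1 if d >= 0 else -1 for d in decision_values)
    return votes.most_common(1)[0][0]
```

For example, with fold outputs `[0.5, -0.2, 0.9]`, averaging yields +1 (mean 0.4), and majority voting also yields +1 (two positive votes against one negative); the two rules can disagree when a minority of folds produce large-magnitude outputs.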