Support Vector Machine Failure in Imbalanced Datasets

Imbalanced datasets often pose challenges in classification problems. In this work we study and quantify the problem of imbalanced classification using support vector machines (SVMs). We identify the conditions under which SVM failure occurs, both theoretically and experimentally, and show that it can be relevant even for very weakly imbalanced data. We present guidelines for exploratory data analysis that help avoid SVM failure.