Performance evaluations of supervised learners on imbalanced datasets

Class distributions in a dataset may be imbalanced, and the samples of each class may be spread unevenly across the feature space. Such datasets are common in real-world applications. This study analyzes the classification performance of supervised learners on skewed datasets. Decision Trees, k-Nearest Neighbors, Naïve Bayes, Support Vector Machines, and Logistic Regression are used in the experiments. The most successful classifiers on skewed datasets are, in order, Logistic Regression, Naïve Bayes, and Decision Trees.
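
As an illustration of why evaluation on skewed data needs care, the following minimal sketch scores a trivial majority-class predictor on a hypothetical 95/5 dataset. The abstract does not state which metrics the study used; accuracy, recall, F1, and balanced accuracy are assumptions chosen for illustration, and the function names `confusion_counts` and `evaluate` are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Return metrics that behave differently on skewed data.

    The metric choice is an assumption; the source does not name its metrics.
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / total,
        "recall": recall,
        "precision": precision,
        "f1": 2 * precision * recall / denom if denom else 0.0,
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Hypothetical skewed dataset: 95 majority samples (0), 5 minority samples (1).
y_true = [0] * 95 + [1] * 5

# A degenerate "classifier" that always predicts the majority class.
majority_pred = [0] * 100

scores = evaluate(y_true, majority_pred)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

In this sketch the majority-class predictor attains high accuracy while its recall and F1 on the minority class are zero, which is why a comparison of learners on skewed datasets generally reports more than accuracy alone.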