Application of data mining to candidate screening

Classification models are supervised learning methods that predict the value of a categorical target attribute. They use a set of labeled examples, called the training set, to learn to predict the target class of a future example whose class is unknown. The development of algorithms capable of learning from past experience is an important step in emulating inductive learning in humans. Classification finds application in many domains, including selection of customers for a marketing campaign, fraud detection, diagnosis of diseases, image recognition, and spam e-mail filtering. In this paper, we present a comparative study of the Naive Bayes and k-Nearest Neighbors (kNN) classification methods applied to the problem of screening candidates for a vacant position in an organization. The observable attributes of a candidate profile are first established. In the training phase, a training set of example profiles is used to learn the classification rules of the organization. In the test phase, the accuracy of the classification model is assessed by classifying example profiles that are not included in the training set but whose target class is already known. In the prediction phase, a profile whose target class is not known is classified. We report the results of comparing the two methods.
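
The training, test, and prediction phases described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the study: the three profile attributes (degree, experience, skill test), the example profiles and their labels, the use of Laplace smoothing in Naive Bayes, and the choice of Hamming distance with k = 3 for kNN.

```python
from collections import Counter, defaultdict

# Hypothetical profiles: (degree, experience, skill_test) -> screening decision.
TRAIN = [
    (("masters", "senior", "pass"), "shortlist"),
    (("masters", "junior", "pass"), "shortlist"),
    (("bachelors", "junior", "pass"), "shortlist"),
    (("diploma", "senior", "pass"), "shortlist"),
    (("bachelors", "junior", "fail"), "reject"),
    (("diploma", "junior", "fail"), "reject"),
    (("diploma", "senior", "fail"), "reject"),
    (("masters", "junior", "fail"), "reject"),
]
# Held-out profiles with known classes, none of which appear in TRAIN.
TEST = [
    (("bachelors", "senior", "pass"), "shortlist"),
    (("masters", "senior", "fail"), "reject"),
    (("bachelors", "senior", "fail"), "reject"),
]


def train_naive_bayes(examples):
    """Training phase: count classes and per-class attribute values."""
    class_counts = Counter(label for _, label in examples)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = defaultdict(set)           # attribute index -> observed values
    for profile, label in examples:
        for i, value in enumerate(profile):
            value_counts[(i, label)][value] += 1
            domains[i].add(value)
    return class_counts, value_counts, domains


def predict_naive_bayes(model, profile):
    """Pick the class maximizing P(class) * prod_i P(value_i | class)."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total
        for i, value in enumerate(profile):
            # Laplace (add-one) smoothing avoids zero probabilities.
            score *= (value_counts[(i, label)][value] + 1) / (n + len(domains[i]))
        scores[label] = score
    return max(scores, key=scores.get)


def predict_knn(examples, profile, k=3):
    """Majority vote among the k training profiles nearest in Hamming distance."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    nearest = sorted(examples, key=lambda ex: hamming(ex[0], profile))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(predict, labelled):
    """Test phase: fraction of held-out profiles classified correctly."""
    return sum(predict(p) == label for p, label in labelled) / len(labelled)


model = train_naive_bayes(TRAIN)
nb_accuracy = accuracy(lambda p: predict_naive_bayes(model, p), TEST)
knn_accuracy = accuracy(lambda p: predict_knn(TRAIN, p), TEST)
print("Naive Bayes accuracy:", nb_accuracy)
print("kNN accuracy:", knn_accuracy)

# Prediction phase: classify a profile whose class is not known.
new_profile = ("diploma", "junior", "pass")
print("Naive Bayes:", predict_naive_bayes(model, new_profile))
print("kNN:", predict_knn(TRAIN, new_profile))
```

Because the data here is invented and tiny, the accuracies it prints say nothing about the relative merits of the two methods; the sketch only shows how the three phases fit together.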