Classification of Questions Based on Difficulty Levels using Support Vector Machine and Naïve Bayes Algorithms for Imbalanced Class

Quiz questions are an important instrument for measuring student learning progress, and lecturers use the results as a benchmark when preparing learning materials. Because these results also serve as the basis for assessment by lecturers, the measurement of student competency achievement must be accurate. This requires a well-functioning question instrument that can distinguish high-ability students from low-ability students according to defined criteria. A question is considered good when its level of difficulty is balanced (proportional), that is, when it is neither too difficult nor too easy. On that basis, questions should be grouped by difficulty level so that question packages with the appropriate proportions can be assembled. The case study is the Data Warehouse course in the undergraduate (S1) Information System program at Telkom University, chosen because Data Warehouse is a compulsory subject in that program. For the classification, this study compares the Naïve Bayes algorithm and the Support Vector Machine (SVM), before and after applying SMOTE to the imbalanced classes. Before SMOTE, the average accuracy score was 85.73% for Naïve Bayes and 85.11% for SVM. After SMOTE, it was 88.9% for Naïve Bayes and 97.82% for SVM. The highest accuracy was therefore obtained with SVM classification after SMOTE.
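The abstract reports results before and after SMOTE but does not describe how the oversampling was carried out. As an illustration only, the following minimal sketch shows the core idea of SMOTE: each synthetic minority sample is an interpolation between a minority sample and one of its nearest minority neighbors. The function name `smote`, its parameters, and the two-feature vectors are hypothetical and are not taken from the study, which may have used a library implementation with different settings.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbors.

    Illustrative sketch of the SMOTE idea, not the study's implementation.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbors, excluding the base itself.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        # Pick a random point on the segment from base to neighbor.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Hypothetical feature vectors for two difficulty classes (assumed data).
easy = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15), (0.95, 0.05), (0.7, 0.3), (0.75, 0.2)]
hard = [(0.2, 0.8), (0.3, 0.9)]

# Oversample the minority class until both classes have the same size.
hard_balanced = hard + smote(hard, n_new=len(easy) - len(hard))
```

In a comparison like the one described here, oversampling would normally be applied to the training split only, so that synthetic points do not leak into the data used to measure accuracy.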