Comparison of Naive Bayes and K-NN Method on Tuition Fee Payment Overdue Prediction

Attending basic education is an obligation for all Indonesian citizens. Funding is one of the input components needed to deliver education, and can even be considered the main requirement for achieving educational goals. For a private educational institution in Indonesia, costs are covered mainly by students' tuition payments. SMK Al-Islam Surakarta is a private school that requires all its students to pay tuition fees monthly. According to last year's administrative report, around 60% of its students paid their tuition fees late. Since the school's operational costs depend heavily on income from tuition fees, this is considered a significant problem that needs to be both managed and predicted. This research discusses techniques for predicting late payment of tuition fees. Of the many popular methods available in this area, we examine two: Naive Bayes and K-Nearest Neighbor (K-NN). This study compares the accuracy of the two methods. The data used for the experiments is the official basic education data of Al-Islam Surakarta Vocational School for 2017/2018, totaling 236 records. To increase accuracy, this study also combines the prediction methods with the Information Gain feature selection technique, which is commonly used to select the most informative attributes for the prediction process. Finally, the system is evaluated using a confusion matrix. The results show that the Naive Bayes method with Information Gain attribute selection produced the highest accuracy, at 69%.
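As a rough illustration of two steps named above, the following Python sketch computes Information Gain for a categorical attribute and derives accuracy from a confusion matrix. It is only a minimal sketch under stated assumptions, not the study's implementation: the function names, the attribute values, the labels, and the confusion matrix counts are all hypothetical toy values and are not taken from the school's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting
    the records by the value of one categorical attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def accuracy(confusion):
    """Share of correct predictions: diagonal sum over total count.
    Rows are actual classes, columns are predicted classes."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical toy records (not from the study's data set).
labels = ["late", "late", "on_time", "on_time"]
informative = ["low", "low", "high", "high"]   # separates the classes fully
uninformative = ["a", "b", "a", "b"]           # tells us nothing about the class

gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)

# Hypothetical confusion matrix: [[TP, FN], [FP, TN]] with made-up counts.
toy_accuracy = accuracy([[50, 10], [12, 28]])
```

Attributes with higher Information Gain would be kept for the classifier and those near zero dropped; where exactly to set that cut-off is a design choice this sketch does not address.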
