COMBINATION OF LOGISTIC REGRESSION AND GRADIENT BOOST TREE FOR DETECTING EMAIL SPAM

Email spam is a serious problem experienced by users all over the world. In 2016, spam was reported at 61.66% of world email traffic. Most of the spam that recipients receive consists of advertisements, so spam filtering is needed to minimize the amount of spam that reaches them. Spam filtering can be treated as a classification problem over email messages. In this study, the classification is carried out using the Gradient Boost Tree method. However, this method can overfit when the data are noisy. The classification is therefore performed with a Gradient Boost Tree algorithm that is optimized using Logistic Regression. This study compares classification results on the spambase dataset for the Gradient Boost Tree algorithm alone and for the same algorithm with Logistic Regression added in the feature selection process. The highest accuracy, 95.13%, was obtained by the Gradient Boost Tree algorithm optimized with Logistic Regression.
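
The comparison described above can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes scikit-learn, replaces the spambase dataset with synthetic data of the same width (57 features), and assumes that features are selected by keeping those whose Logistic Regression coefficient magnitude is at or above the median. The abstract does not state the selection rule, so the rule, the hyperparameters, and all variable names here are illustrative assumptions, and the accuracies printed will not match the reported 95.13%.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in for spambase (57 features, binary label).
# Replace X and y with the real dataset to reproduce the actual experiment.
X, y = make_classification(
    n_samples=1000, n_features=57, n_informative=15, n_redundant=5,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Baseline: gradient boosted trees trained on all features.
gbt_only = GradientBoostingClassifier(random_state=0)
gbt_only.fit(X_train, y_train)

# Variant: Logistic Regression ranks the standardized features by
# coefficient magnitude; features at or above the median are kept
# (an assumed rule) and passed to the gradient boosted trees.
lr_gbt = make_pipeline(
    StandardScaler(),
    SelectFromModel(LogisticRegression(max_iter=1000), threshold="median"),
    GradientBoostingClassifier(random_state=0),
)
lr_gbt.fit(X_train, y_train)

acc_gbt_only = gbt_only.score(X_test, y_test)
acc_lr_gbt = lr_gbt.score(X_test, y_test)
n_selected = int(lr_gbt.named_steps["selectfrommodel"].get_support().sum())

print(f"GBT only:       accuracy = {acc_gbt_only:.4f} (57 features)")
print(f"LR + GBT:       accuracy = {acc_lr_gbt:.4f} ({n_selected} features)")
```

Placing the selector inside the pipeline means the Logistic Regression model is fitted on the training split only, so the test split does not influence which features are kept.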