Evaluation of the robust multi-stage method (MM) and least median of squares (LMS) for handling outliers in multiple regression

Outliers are observations with extreme values that differ markedly from most of the other data. Ordinary least squares (OLS) is very sensitive to outliers, which can reduce the accuracy of the fitted model. Outliers cannot simply be discarded to improve the fit of the regression equation, because doing so carelessly yields imprecise estimates. This study examines the performance of two robust methods, least median of squares (LMS) and the multi-stage method (MM), compared with OLS in regression analysis of data containing outliers. The analysis was performed on simulated data and on oil palm production data. Based on the average bias of the parameter estimates, the MM method performs best in every scenario of data size and outlier percentage, whereas based on the average root mean square error (RMSE), LMS performs better than MM when the data size is 25. Analysis of the 2018 Indonesian oil palm production data, which has 25 observations and contains 44% outliers, shows that the LMS method produces the smallest RMSE and the highest R², namely 38.81 and 99.78%, respectively. The MM method ranks second, while OLS produces the largest RMSE and the lowest R².
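
To make the comparison concrete, the following is a minimal sketch of how OLS and an approximate LMS fit might be compared on contaminated simulated data. It is not the procedure used in this study. The sample size (100), the outlier fraction (20%), the true coefficients, the size of the outlier shift, and the helper names (`solve`, `fit_ols`, `fit_lms`, `simulate`, `mean_abs_bias`) are all assumptions made for illustration. LMS is approximated by a random search over exact fits to subsets of p points, and the MM method is not included.

```python
import random
import statistics

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def predict(beta, row):
    return sum(b * v for b, v in zip(beta, row))

def fit_ols(X, y):
    """OLS via the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

def median_sq_resid(beta, X, y):
    """LMS criterion: the median of the squared residuals."""
    return statistics.median((yi - predict(beta, r)) ** 2 for r, yi in zip(X, y))

def fit_lms(X, y, n_trials=500, rng=None):
    """Approximate LMS: keep the exact p-point fit with the smallest criterion."""
    rng = rng or random.Random(0)
    p = len(X[0])
    best_beta, best_crit = None, float("inf")
    for _ in range(n_trials):
        idx = rng.sample(range(len(y)), p)
        try:
            beta = solve([X[i] for i in idx], [y[i] for i in idx])
        except ValueError:
            continue  # skip degenerate subsets
        crit = median_sq_resid(beta, X, y)
        if crit < best_crit:
            best_beta, best_crit = beta, crit
    return best_beta

def simulate(n=100, outlier_frac=0.2, seed=1):
    """Two predictors plus intercept; a fraction of responses is shifted upward (assumed design)."""
    rng = random.Random(seed)
    true_beta = [1.0, 2.0, -1.0]  # illustrative values only
    X = [[1.0, rng.uniform(0, 10), rng.uniform(0, 10)] for _ in range(n)]
    y = [predict(true_beta, r) + rng.gauss(0, 1) for r in X]
    for i in range(int(outlier_frac * n)):
        y[i] += 50.0  # contaminate the first observations
    return X, y, true_beta

def mean_abs_bias(beta, true_beta):
    """Average absolute deviation of the estimates from the true coefficients."""
    return sum(abs(b - t) for b, t in zip(beta, true_beta)) / len(true_beta)

X, y, true_beta = simulate()
beta_ols = fit_ols(X, y)
beta_lms = fit_lms(X, y)
bias_ols = mean_abs_bias(beta_ols, true_beta)
bias_lms = mean_abs_bias(beta_lms, true_beta)
print("OLS mean absolute bias:", round(bias_ols, 3))
print("LMS mean absolute bias:", round(bias_lms, 3))
```

Under this assumed design, the shifted responses pull the OLS coefficients away from their true values, while the median-based criterion largely ignores them. Averaging the bias over many simulated data sets, rather than the single draw used here, would be closer in spirit to the comparison described above.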