Feature Selection Using Multivariate Adaptive Regression Splines in Telecommunication Fraud Detection

Feature selection identifies the most significant features for a given task and rejects the noisy, irrelevant and redundant features of the dataset that might mislead the classifier. It also reduces the dimensionality of the dataset, which shortens computation time and can improve prediction performance. This paper aims to perform feature selection that yields an optimal feature subset for more accurate classification, using Multivariate Adaptive Regression Splines (MARS) in a Spline Model (SM) classifier. A comparative study of prediction performance was conducted against other classifiers, namely Decision Tree (DT), Neural Network (NN) and Support Vector Machine (SVM), each trained on the same optimal feature subset produced by MARS. In the results, the MARS technique reduced the number of features by up to 87.76% and improved classification accuracy. In the comparative analysis, the Spline Model classifier performed best, achieving the highest accuracy (97.44%) among the classifiers compared.
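The idea of ranking features by how much a MARS-style hinge basis function explains the target can be illustrated with a small sketch. This is not the procedure used in the paper: it is a simplified, single-basis screening step written in plain Python, without the full MARS forward and backward passes or generalized cross-validation. The data are synthetic, and the function names (`hinge`, `score_features`, `select_features`) and the 0.2 threshold are assumptions chosen for illustration only.

```python
import random


def hinge(x, knot, sign):
    """MARS hinge basis: max(0, x - knot) for sign=+1, max(0, knot - x) for sign=-1."""
    return max(0.0, sign * (x - knot))


def r_squared_single_basis(h, y):
    """R^2 of the least-squares fit y ~ a + b*h (squared Pearson correlation)."""
    n = len(y)
    mean_h = sum(h) / n
    mean_y = sum(y) / n
    s_hh = sum((v - mean_h) ** 2 for v in h)
    s_yy = sum((w - mean_y) ** 2 for w in y)
    if s_hh < 1e-12 or s_yy < 1e-12:
        return 0.0  # constant basis or constant target: nothing to explain
    s_hy = sum((v - mean_h) * (w - mean_y) for v, w in zip(h, y))
    return (s_hy ** 2) / (s_hh * s_yy)


def score_features(X, y):
    """For each feature, return the best R^2 over all candidate knots and both hinge directions."""
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        best = 0.0
        for knot in sorted(set(column)):      # candidate knots: observed values
            for sign in (1.0, -1.0):          # both mirrored hinges
                h = [hinge(v, knot, sign) for v in column]
                best = max(best, r_squared_single_basis(h, y))
        scores.append(best)
    return scores


def select_features(scores, threshold):
    """Keep the indices of features whose score reaches the (hypothetical) threshold."""
    return [j for j, s in enumerate(scores) if s >= threshold]


# Synthetic data (illustrative only): feature 0 drives the label, features 1 and 2 are noise.
random.seed(0)
X = [[random.random() for _ in range(3)] for _ in range(200)]
y = [1.0 if row[0] > 0.6 else 0.0 for row in X]

scores = score_features(X, y)
selected = select_features(scores, threshold=0.2)
print("scores:", [round(s, 3) for s in scores])
print("selected feature indices:", selected)
```

In this toy setting the informative feature receives a much higher score than the two noise features, so only it survives the threshold. A full MARS model would instead add basis functions greedily, allow interactions between them, and prune with generalized cross-validation before the surviving features are passed to a classifier.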
