Efficient adaptive regression spline algorithms based on mapping approach with a case study on finance

Multivariate adaptive regression splines (MARS) has become a popular data mining (DM) tool owing to its flexible model-building strategy for high-dimensional data. It has been reported to outperform other well-known methods in many areas such as finance, informatics, technology and science, and many studies have sought to improve its performance. One such study proposed an alternative backward stepwise algorithm, the Conic-MARS (CMARS) method, which treats the penalized residual sum of squares for MARS as a Tikhonov regularization problem. Another, S-FMARS, introduced a time-efficient procedure by modifying the forward step of MARS via a mapping approach. Inspired by the advantages of MARS, CMARS and S-FMARS, this study proposes two hybrid methods that aim to provide time-efficient DM tools without degrading performance, especially for large datasets. The resulting methods, called SMARS and SCMARS, are tested on simulated and real-life datasets against several performance criteria, including accuracy, complexity, stability and robustness. As a DM application, the hybrid methods are also applied to an important problem in finance: predicting the interest rates offered by a Turkish bank to its customers. The results show that the proposed hybrid methods are the most time-efficient while remaining competitive in performance, and can therefore be considered powerful choices, particularly for large datasets.
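To make the two ingredients named above concrete, the following is a minimal sketch, not the authors' algorithm. It builds MARS-style truncated linear (hinge) basis functions at hand-picked knots and fits their coefficients by Tikhonov-regularized least squares. The function names, the fixed knots, the identity penalty matrix and the closed-form solve are all assumptions made for illustration. MARS selects knots adaptively, and CMARS uses a different penalty matrix and solves the problem with conic quadratic programming.

```python
import numpy as np

def hinge_pair(x, knot):
    """Return the reflected pair of hinge functions max(0, x - t) and max(0, t - x)."""
    return np.maximum(0.0, x - knot), np.maximum(0.0, knot - x)

def build_basis(x, knots):
    """Design matrix with an intercept column and one hinge pair per knot.

    The knots are fixed by hand here (an assumption); MARS would search for them.
    """
    cols = [np.ones_like(x)]
    for t in knots:
        cols.extend(hinge_pair(x, t))
    return np.column_stack(cols)

def tikhonov_fit(B, y, lam, L=None):
    """Minimise ||y - B a||^2 + lam * ||L a||^2 via the normal equations.

    L defaults to the identity (plain ridge), an illustrative stand-in
    for the penalty matrix used in CMARS.
    """
    if L is None:
        L = np.eye(B.shape[1])
    return np.linalg.solve(B.T @ B + lam * (L.T @ L), B.T @ y)

# Synthetic one-dimensional data with a kink at x = 0.5 (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.abs(x - 0.5) + 0.01 * rng.standard_normal(x.size)

B = build_basis(x, knots=[0.25, 0.5, 0.75])
coef = tikhonov_fit(B, y, lam=1e-3)
rmse = float(np.sqrt(np.mean((B @ coef - y) ** 2)))
print(f"basis functions: {B.shape[1]}, training RMSE: {rmse:.4f}")
```

The hinge pairs at different knots are linearly dependent, so the unpenalized normal equations would be singular; the penalty term is what makes the system solvable in this sketch.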
