Support Vector-Quantile Regression Random Forest Hybrid for Regression Problems

In this paper, we propose a novel support vector based soft computing technique for solving regression problems. The proposed hybrid outperforms techniques previously reported in the literature in terms of both prediction accuracy and training time. We also present a comparative study of quantile regression, differential evolution trained wavelet neural networks (DEWNN), and quantile regression random forest ensemble models for prediction in regression problems. We further identify the intervals of the random forest parameter values for which the performance figures of the Quantile Regression Random Forest (QRFF) are statistically stable. The effectiveness of the QRFF over quantile regression and DEWNN is evaluated on the Auto MPG, Body Fat, Boston Housing, Forest Fires, and Pollution datasets using 10-fold cross validation.
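As a rough illustration of the evaluation setup named in the abstract, the sketch below shows the standard pinball (quantile) loss and a plain 10-fold index split. It is not the authors' code: the function names `pinball_loss` and `k_fold_indices` are hypothetical, the folds are contiguous and unshuffled for simplicity, and the abstract does not state which error measure the study actually reports.

```python
from typing import Iterator, List, Sequence, Tuple


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], tau: float) -> float:
    """Mean pinball (quantile) loss at quantile level tau, with 0 < tau < 1."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def k_fold_indices(n: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous, near-equal folds.

    Assumption for this sketch: rows are not shuffled here; in practice
    the data would usually be shuffled once before splitting.
    """
    # The first (n % k) folds get one extra sample so sizes differ by at most 1.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size
```

A model would be fitted on each `train` index list and scored on the matching `test` list, with the ten scores averaged to give the cross-validated figure.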
