VIF Regression: A Fast Regression Algorithm for Large Data

We propose a fast regression algorithm that substantially reduces the computational cost of variable search while retaining good accuracy. It is guaranteed to discover correlated features that are collectively predictive and to avoid model over-fitting. Because it statistically controls the mFDR (marginal False Discovery Rate), the algorithm can search in a single pass and guarantee the accuracy of the chosen sparse model without cross-validation. Numerical results show that our algorithm is much faster than competing algorithms while remaining competitive in accuracy with the best, slower ones.
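To make the one-pass idea concrete, the sketch below shows a generic streamwise selection loop with α-investing (the sequential error-control scheme that underlies mFDR control). It is an illustrative simplification, not the paper's exact VIF-corrected test: each candidate is tested once, in order, against the current residual, and a "wealth" budget is spent on failed tests and replenished on discoveries. All function and parameter names here are hypothetical.

```python
import numpy as np
from scipy import stats

def alpha_investing_selection(X, y, wealth=0.5, payout=0.5):
    """One-pass streamwise feature selection with alpha-investing.

    Illustrative sketch only: uses a plain t-test of each candidate
    against the current residual, not the VIF-corrected statistic.
    """
    n, p = X.shape
    selected = []
    resid = y - y.mean()
    for j in range(p):
        if wealth <= 0:
            break
        alpha_j = wealth / (2 * (len(selected) + 1))  # spend a slice of wealth
        xc = X[:, j] - X[:, j].mean()
        ssx = xc @ xc
        if ssx == 0:
            continue
        beta = (xc @ resid) / ssx                 # marginal fit to residual
        fit = beta * xc
        rss = np.sum((resid - fit) ** 2)
        se = np.sqrt(rss / (n - 2)) / np.sqrt(ssx)
        t = beta / se
        pval = 2 * stats.t.sf(abs(t), df=n - 2)
        if pval < alpha_j:
            selected.append(j)
            resid = resid - fit                   # greedy residual update
            wealth += payout                      # earn wealth on a discovery
        else:
            wealth -= alpha_j                     # pay for the failed test
    return selected
```

Because every feature is visited exactly once, the cost is linear in the number of candidates, which is the source of the speedup the abstract claims; the wealth mechanism is what bounds the expected number of false discoveries.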
