Variance Function Estimation in High-dimensions

We consider the high-dimensional heteroscedastic regression model, where the mean and the log variance are modeled as a linear combination of input variables. Existing literature on high-dimensional linear regression models has largely ignored non-constant error variances, even though they commonly occur in a variety of applications ranging from biostatistics to finance. In this paper we study a class of nonconvex penalized pseudolikelihood estimators for both the mean and variance parameters. We show that the Heteroscedastic Iterative Penalized Pseudolikelihood Optimizer (HIPPO) achieves the oracle property, that is, we prove that the rates of convergence are the same as if the true model was known. We demonstrate numerical properties of the procedure on a simulation study and real world data.

[1]  Bin Yu,et al.  The Lasso under Heteroscedasticity , 2010, 1011.1026.

[2]  Cun-Hui Zhang Nearly unbiased variable selection under minimax concave penalty , 2010, 1002.4734.

[3]  Sham M. Kakade,et al.  Dimension-free tail inequalities for sums of random matrices , 2011, ArXiv.

[4]  J. Lafferty,et al.  Sparse additive models , 2007, 0711.4555.

[5]  D. Ruppert,et al.  Transformation and Weighting in Regression , 1988 .

[6]  Julien Mairal,et al.  Optimization with Sparsity-Inducing Penalties , 2011, Found. Trends Mach. Learn..

[7]  Marc Teboulle,et al.  A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems , 2009, SIAM J. Imaging Sci..

[8]  Wayne A. Fuller,et al.  Least Squares Estimation When the Covariance Matrix and Parameter Vector are Functionally Related , 1980 .

[9]  Jianqing Fan,et al.  Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties , 2001 .

[10]  Jianqing Fan,et al.  Variance estimation using refitted cross‐validation in ultrahigh dimensional regression , 2010, Journal of the Royal Statistical Society. Series B, Statistical methodology.

[11]  Jianqing Fan,et al.  Nonconcave Penalized Likelihood With NP-Dimensionality , 2009, IEEE Transactions on Information Theory.

[12]  Tong Zhang,et al.  A General Theory of Concave Regularization for High-Dimensional Sparse Estimation Problems , 2011, 1108.4988.

[13]  H. Dette,et al.  The adaptive lasso in high-dimensional sparse heteroscedastic models , 2013 .

[14]  Tong Zhang Some sharp performance bounds for least squares regression with L1 regularization , 2009, 0908.2869.

[15]  Cun-Hui Zhang,et al.  Scaled sparse linear regression , 2011, 1104.4595.

[16]  Hongzhe Li,et al.  High‐Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis , 2012, Biometrics.

[17]  A. Ziegler,et al.  Generalized Estimating Equations , 2010, Methods of Information in Medicine.

[18]  F. Bach,et al.  Optimization with Sparsity-Inducing Penalties (Foundations and Trends(R) in Machine Learning) , 2011 .

[19]  A. Harvey Estimating Regression Models with Multiplicative Heteroscedasticity , 1976 .

[20]  Martin J. Wainwright,et al.  Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using $\ell _{1}$ -Constrained Quadratic Programming (Lasso) , 2009, IEEE Transactions on Information Theory.

[21]  Jinchi Lv,et al.  A unified approach to model selection and sparse recovery using regularized least squares , 2009, 0905.3573.

[22]  Cun-Hui Zhang,et al.  The sparsity and bias of the Lasso selection in high-dimensional linear regression , 2008, 0808.0967.

[23]  Peng Zhao,et al.  On Model Selection Consistency of Lasso , 2006, J. Mach. Learn. Res..

[24]  T. W. Anderson An Introduction to Multivariate Statistical Analysis , 1959 .