Paper information - MM Algorithms for Minimizing Nonsmoothly Penalized Objective Functions (Jan 2011)

The use of regularization, or penalization, has become increasingly common in high-dimensional statistical analysis over the past decade, where a common goal is to simultaneously select important variables and estimate their effects. It has been shown by several authors that these goals can be achieved by minimizing some parameter-dependent "goodness of fit" function (e.g., a negative log-likelihood) subject to a penalization that promotes sparsity. Penalty functions that are nonsmooth (i.e., not differentiable) at the origin have received substantial attention, arguably beginning with LASSO (Tibshirani, 1996). The current literature tends to focus on specific combinations of smooth data fidelity (i.e., goodness-of-fit) and nonsmooth penalty functions. One result of this combined specificity has been a proliferation in the number of computational algorithms designed to solve fairly narrow classes of optimization problems involving objective functions that are not everywhere continuously differentiable. In this paper, we propose a general class of algorithms for optimizing an extensive variety of nonsmoothly penalized objective functions that satisfy certain regularity conditions. The proposed framework utilizes the majorization-minimization (MM) algorithm as its core optimization engine. The resulting algorithms rely on iterated soft-thresholding, implemented componentwise, allowing for fast, stable updating that avoids the need for any high-dimensional matrix inversion. We establish a local convergence theory for this class of algorithms under weaker assumptions than previously considered in the statistical literature. We also demonstrate the exceptional effectiveness of new acceleration methods, originally proposed for the EM algorithm, in this class of problems. Simulation results and a microarray data example are provided to demonstrate the algorithm's capabilities and versatility.
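To make the idea of componentwise iterated soft-thresholding concrete, the following is a minimal sketch for one familiar special case only: least squares with a lasso penalty. It is not the paper's general algorithm, and it omits the acceleration methods mentioned above. The function names `soft_threshold` and `mm_lasso`, the penalty weight `lam`, and the choice of the squared Frobenius norm of `X` as the majorizing curvature are all assumptions made for this illustration.

```python
from typing import List


def soft_threshold(z: float, t: float) -> float:
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def mm_lasso(X: List[List[float]], y: List[float], lam: float,
             n_iter: int = 500) -> List[float]:
    """Hypothetical helper: minimize 0.5*||y - X b||^2 + lam*||b||_1.

    Each iteration minimizes a separable quadratic majorizer of the
    smooth term, which reduces to one soft-thresholding step per
    coefficient. No matrix inversion is needed.
    """
    n, p = len(X), len(X[0])
    # Assumption: use the squared Frobenius norm of X as the curvature.
    # It is an upper bound on the largest eigenvalue of X^T X, so the
    # quadratic surrogate lies above the least-squares term everywhere.
    curvature = sum(v * v for row in X for v in row)
    if curvature == 0.0:
        return [0.0] * p  # all-zero design: the penalty alone gives b = 0
    beta = [0.0] * p
    for _ in range(n_iter):
        # Residuals at the current iterate.
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p))
                 for i in range(n)]
        # Gradient of the smooth least-squares term: -X^T resid.
        grad = [-sum(X[i][j] * resid[i] for i in range(n))
                for j in range(p)]
        # Componentwise update: gradient step, then soft-threshold.
        beta = [soft_threshold(beta[j] - grad[j] / curvature,
                               lam / curvature)
                for j in range(p)]
    return beta
```

For example, with an identity design `X = [[1.0, 0.0], [0.0, 1.0]]`, `y = [3.0, 0.5]` and `lam = 1.0`, the iterates approach the soft-thresholded response, roughly `[2.0, 0.0]`: the second coefficient is set exactly to zero, which is the sparsity-promoting behavior the abstract refers to. The Frobenius bound is conservative, so a tighter curvature would converge in fewer iterations.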

R. Strawderman | M. Wells | E. Schifano
