[1] A. Tikhonov. On the stability of inverse problems, 1943.
[2] Mario Bertero, et al. The Stability of Inverse Problems, 1980.
[3] Jorge Nocedal, et al. On the limited memory BFGS method for large scale optimization, 1989, Math. Program.
[4] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.
[5] Wenjiang J. Fu. Penalized Regressions: The Bridge versus the Lasso, 1998.
[6] Bin Yu, et al. High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence, 2008, arXiv:0811.3628.
[7] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[8] Cheolwoo Park, et al. Bridge regression: Adaptivity and group selection, 2011.
[9] Zongben Xu, et al. Fast image deconvolution using closed-form thresholding formulas of Lq (q = 1/2, 2/3) regularization, 2012.
[10] Michael A. Saunders, et al. Proximal Newton-type methods for convex optimization, 2012, NIPS.
[11] Zongben Xu, et al. Fast image deconvolution using closed-form thresholding formulas of Lq (q = 1/2, 2/3) regularization, 2013, J. Vis. Commun. Image Represent.
[12] Saeed Ghadimi, et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming, 2013, SIAM J. Optim.
[13] Yoshua Bengio, et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations, 2015, NIPS.
[14] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[15] Roger B. Grosse, et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature, 2015, ICML.
[16] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[17] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[19] Yiran Chen, et al. Learning Structured Sparsity in Deep Neural Networks, 2016, NIPS.
[20] Saeed Ghadimi, et al. Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization, 2013, Mathematical Programming.
[21] Alexander J. Smola, et al. Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization, 2016, NIPS.
[22] Zeyuan Allen-Zhu. Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter, 2017, arXiv.
[23] Eunho Yang, et al. Sparse + Group-Sparse Dirty Models: Statistical Guarantees without Unreasonable Conditions and a Case for Non-Convexity, 2017, ICML.
[24] Sashank J. Reddi, et al. On the Convergence of Adam and Beyond, 2018, ICLR.
[25] Zhe Wang, et al. SpiderBoost and Momentum: Faster Stochastic Variance Reduction Algorithms, 2018, arXiv:1810.10690.
[26] Max Welling, et al. Learning Sparse Neural Networks through L0 Regularization, 2017, ICLR.
[27] Yi Zhou, et al. SpiderBoost: A Class of Faster Variance-reduced Algorithms for Nonconvex Optimization, 2018, arXiv.
[28] Sanjiv Kumar, et al. Adaptive Methods for Nonconvex Optimization, 2018, NeurIPS.
[29] Tianbao Yang, et al. Non-asymptotic Analysis of Stochastic Methods for Non-Smooth Non-Convex Regularized Problems, 2019, NeurIPS.
[30] Michael Carbin, et al. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, 2018, ICLR.
[31] Stephen Becker, et al. On Quasi-Newton Forward-Backward Splitting: Proximal Calculus and Convergence, 2018, SIAM J. Optim.
[32] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[33] Dmitriy Drusvyatskiy, et al. Stochastic model-based minimization of weakly convex functions, 2018, SIAM J. Optim.
[34] Yu Bai, et al. ProxQuant: Quantized Neural Networks via Proximal Operators, 2018, ICLR.
[35] Rong Jin, et al. Stochastic Optimization for DC Functions and Non-smooth Non-convex Regularizers with Non-asymptotic Convergence, 2018, ICML.
[36] Eunho Yang, et al. Trimming the ℓ1 Regularizer: Statistical Analysis, Optimization, and Applications to Deep Learning, 2019, ICML.
[37] Eunho Yang, et al. Stochastic Gradient Methods with Block Diagonal Matrix Adaptation, 2019, ICLR.
[38] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[39] Mingyi Hong, et al. On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization, 2018, ICLR.
[40] Guodong Zhang, et al. Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks, 2019, NeurIPS.
[41] Guodong Zhang, et al. Three Mechanisms of Weight Decay Regularization, 2018, ICLR.
[42] Lam M. Nguyen, et al. ProxSARAH: An Efficient Algorithmic Framework for Stochastic Composite Nonconvex Optimization, 2019, J. Mach. Learn. Res.
[43] Ke Tang, et al. Stochastic Gradient Descent for Nonconvex Learning Without Bounded Gradient Assumptions, 2019, IEEE Transactions on Neural Networks and Learning Systems.
[44] Zhihui Zhu, et al. Orthant Based Proximal Stochastic Gradient Method for ℓ1-Regularized Optimization, 2020, ECML/PKDD.
[45] Symeon Chatzinotas, et al. ProxSGD: Training Structured Neural Networks under Regularization and Constraints, 2020, ICLR.