Adaptive Methods for Nonconvex Optimization
Sanjiv Kumar | Manzil Zaheer | Sashank J. Reddi | Satyen Kale | Devendra Singh Sachan
[1] Yurii Nesterov,et al. Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.
[2] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[3] H. Robbins. A Stochastic Approximation Method , 1951.
[4] Matthew J. Streeter,et al. Adaptive Bound Optimization for Online Convex Optimization , 2010, COLT 2010.
[5] Yoram Singer,et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res.
[6] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.
[7] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.
[8] Daniel Povey,et al. Krylov Subspace Descent for Deep Learning , 2011, AISTATS.
[9] Saeed Ghadimi,et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming , 2013, SIAM J. Optim.
[10] Geoffrey E. Hinton,et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[11] Christopher D. Manning,et al. Stanford Neural Machine Translation Systems for Spoken Language Domains , 2015, IWSLT.
[12] Shai Shalev-Shwartz,et al. Beyond Convexity: Stochastic Quasi-Convex Optimization , 2015, NIPS.
[13] Jianxiong Xiao,et al. 3D ShapeNets: A deep representation for volumetric shapes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[15] Roger B. Grosse,et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature , 2015, ICML.
[16] Eric Nichols,et al. Named Entity Recognition with Bidirectional LSTM-CNNs , 2015, TACL.
[17] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[18] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Alexander J. Smola,et al. Fast Stochastic Methods for Nonsmooth Nonconvex Optimization , 2016, ArXiv.
[20] Alexander J. Smola,et al. Fast Incremental Method for Nonconvex Optimization , 2016, ArXiv.
[21] Yair Carmon,et al. Accelerated Methods for Non-Convex Optimization , 2016, SIAM J. Optim.
[22] Saeed Ghadimi,et al. Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization , 2013, Mathematical Programming.
[23] Eduard H. Hovy,et al. End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF , 2016, ACL.
[24] Zeyuan Allen Zhu,et al. Variance Reduction for Faster Non-Convex Optimization , 2016, ICML.
[25] Alexander J. Smola,et al. Stochastic Variance Reduction for Nonconvex Optimization , 2016, ICML.
[26] Tengyu Ma,et al. Finding Approximate Local Minima for Nonconvex Optimization in Linear Time , 2016, ArXiv.
[27] Tuo Zhao,et al. Stochastic Variance Reduced Optimization for Nonconvex Sparse Learning , 2016, ICML.
[28] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[29] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Richard Socher,et al. Improving Generalization Performance by Switching from Adam to SGD , 2017, ArXiv.
[31] Alexander J. Smola,et al. Deep Sets , 2017, 1703.06114.
[32] Sergey Ioffe,et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.
[33] Michael I. Jordan,et al. How to Escape Saddle Points Efficiently , 2017, ICML.
[34] Alexander J. Smola,et al. A Generic Approach for Escaping Saddle points , 2017, AISTATS.
[35] Kamyar Azizzadenesheli,et al. signSGD: compressed optimisation for non-convex problems , 2018, ICML.
[36] Yu Zhang,et al. Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning , 2018, bioRxiv.