Marco Canini | Arvind Krishnamurthy | Yibo Zhu | Chuanxiong Guo | Tianyi Zhou | Yuchen Jin | Liangyu Zhao
[1] Max Jaderberg, et al. Population Based Training of Neural Networks, 2017, ArXiv.
[2] Kaiming He, et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, 2017, ArXiv.
[3] Michael I. Jordan, et al. How to Escape Saddle Points Efficiently, 2017, ICML.
[4] Paolo Frasconi, et al. Marthe: Scheduling the Learning Rate Via Online Hypergradients, 2019, IJCAI.
[5] Aaron Klein, et al. Learning Curve Prediction with Bayesian Neural Networks, 2016, ICLR.
[6] Ameet Talwalkar, et al. Non-stochastic Best Arm Identification and Hyperparameter Optimization, 2015, AISTATS.
[7] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.
[8] Ameet Talwalkar, et al. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization, 2016, J. Mach. Learn. Res.
[9] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.
[10] Ameet Talwalkar, et al. A System for Massively Parallel Hyperparameter Tuning, 2020, MLSys.
[11] Samuel R. Bowman, et al. Neural Network Acceptability Judgments, 2018, Transactions of the Association for Computational Linguistics.
[12] Frank Hutter, et al. SGDR: Stochastic Gradient Descent with Warm Restarts, 2016, ICLR.
[13] S. Shankar Sastry, et al. Step Size Matters in Deep Learning, 2018, NeurIPS.
[14] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[15] David D. Cox, et al. Hyperopt: A Python Library for Optimizing the Hyperparameters of Machine Learning Algorithms, 2013, SciPy.
[16] Sanja Fidler, et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books, 2015, IEEE International Conference on Computer Vision (ICCV).
[17] Hao Li, et al. Visualizing the Loss Landscape of Neural Nets, 2017, NeurIPS.
[18] Chris Brockett, et al. Automatically Constructing a Corpus of Sentential Paraphrases, 2005, IJCNLP.
[19] Richard Socher, et al. A Closer Look at Deep Learning Heuristics: Learning Rate Restarts, Warmup and Distillation, 2018, ICLR.
[20] Dustin Tran, et al. Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches, 2018, ICLR.
[21] Leslie N. Smith, et al. A Disciplined Approach to Neural Network Hyper-Parameters: Part 1 - Learning Rate, Batch Size, Momentum, and Weight Decay, 2018, ArXiv.
[22] Thibault Langlois, et al. Parameter Adaptation in Stochastic Optimization, 1999.
[23] D. Dennis, et al. A Statistical Method for Global Optimization, 1992, IEEE International Conference on Systems, Man, and Cybernetics.
[24] Kevin Leyton-Brown, et al. Sequential Model-Based Optimization for General Algorithm Configuration, 2011, LION.
[25] Jasper Snoek, et al. Practical Bayesian Optimization of Machine Learning Algorithms, 2012, NIPS.
[26] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[27] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Paolo Frasconi, et al. Forward and Reverse Gradient-Based Hyperparameter Optimization, 2017, ICML.
[29] Hao Wu, et al. Mixed Precision Training, 2017, ICLR.
[30] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[31] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[32] Yuanzhi Li, et al. A Convergence Theory for Deep Learning via Over-Parameterization, 2018, ICML.
[33] Leslie N. Smith, et al. Cyclical Learning Rates for Training Neural Networks, 2015, IEEE Winter Conference on Applications of Computer Vision (WACV).
[34] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[35] Philipp Hennig, et al. Probabilistic Line Searches for Stochastic Optimization, 2015, NIPS.
[36] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[37] Aaron Klein, et al. BOHB: Robust and Efficient Hyperparameter Optimization at Scale, 2018, ICML.
[38] Nando de Freitas, et al. Taking the Human Out of the Loop: A Review of Bayesian Optimization, 2016, Proceedings of the IEEE.
[39] Alex Krizhevsky, et al. One Weird Trick for Parallelizing Convolutional Neural Networks, 2014, ArXiv.
[40] Kenji Kawaguchi, et al. Deep Learning without Poor Local Minima, 2016, NIPS.
[41] Marc G. Genton, et al. Classes of Kernels for Machine Learning: A Statistics Perspective, 2002, J. Mach. Learn. Res.
[42] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[43] Jasper Snoek, et al. Freeze-Thaw Bayesian Optimization, 2014, ArXiv.
[44] Mark W. Schmidt, et al. Online Learning Rate Adaptation with Hypergradient Descent, 2017, ICLR.
[45] Tom Schaul, et al. No More Pesky Learning Rates, 2012, ICML.
[46] Renjie Liao, et al. Understanding Short-Horizon Bias in Stochastic Meta-Optimization, 2018, ICLR.
[47] Kian Hsiang Low, et al. Bayesian Optimization Meets Bayesian Optimal Stopping, 2019, ICML.
[48] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.
[49] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[50] Frank Hutter, et al. Speeding Up Automatic Hyperparameter Optimization of Deep Neural Networks by Extrapolation of Learning Curves, 2015, IJCAI.
[51] Yuanzhi Li, et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers, 2018, NeurIPS.
[52] Yoshua Bengio, et al. Three Factors Influencing Minima in SGD, 2017, ArXiv.
[53] Christopher K. I. Williams, et al. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning), 2005.
[54] Stephen Roberts, et al. Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits, 2020, NeurIPS.
[55] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.
[56] Aaron Klein, et al. Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search, 2018, ArXiv.
[57] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.