Lookbehind Optimizer: k steps back, 1 step forward
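Judging from the title, an inversion of the Lookahead optimizer's "k steps forward, 1 step back" [17], the method pairs multiple sharpness-aware ascent steps in the spirit of SAM [16] with a single Lookahead-style slow-weight interpolation. A minimal NumPy sketch of that combination follows; the function name, the `loss_grad` helper, and all hyperparameter defaults are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def lookbehind_step(w, loss_grad, k=5, rho=0.05, lr=0.1, alpha=0.5):
    """Hypothetical sketch: k SAM-style ascent steps 'back' [16],
    then one Lookahead-style interpolation step 'forward' [17].
    `loss_grad(v)` returns the gradient of the training loss at v."""
    slow = w.copy()   # slow weights, frozen during the k inner steps
    fast = w.copy()   # fast weights, updated at every inner step
    for _ in range(k):
        g = loss_grad(fast)
        # SAM ascent: perturb toward higher loss within an L2 ball of radius rho
        eps = rho * g / (np.linalg.norm(g) + 1e-12)
        # descend the fast weights using the gradient at the perturbed point
        fast = fast - lr * loss_grad(fast + eps)
    # one step forward: pull the slow weights toward the fast weights
    return slow + alpha * (fast - slow)

# Toy usage on the quadratic loss L(w) = 0.5 * ||w||^2, whose gradient is w.
w = np.array([1.0, -2.0])
for _ in range(20):
    w = lookbehind_step(w, loss_grad=lambda v: v)
print(w)  # converges toward the minimum at the origin
```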
[1] Hoki Kim, et al. Exploring the Effect of Multi-step Ascent in Sharpness-Aware Minimization, 2023, arXiv.
[2] Nicolas Flammarion, et al. Towards Understanding Sharpness-Aware Minimization, 2022, ICML.
[3] Timothy M. Hospedales, et al. Fisher SAM: Information Geometry and Sharpness Aware Minimisation, 2022, ICML.
[4] Joey Tianyi Zhou, et al. Sharpness-Aware Training for Free, 2022, NeurIPS.
[5] Gonçalo Mordido, et al. MemSE: Fast MSE Prediction for Noisy Memristor-Based DNN Accelerators, 2022, 2022 IEEE 4th International Conference on Artificial Intelligence Circuits and Systems (AICAS).
[6] Hartwig Adam, et al. Surrogate Gap Minimization Improves Sharpness-Aware Training, 2022, ICLR.
[7] Cho-Jui Hsieh, et al. Towards Efficient and Scalable Sharpness-Aware Minimization, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Muhao Chen, et al. Sharpness-Aware Minimization with Dynamic Reweighting, 2021, EMNLP.
[9] Sanket Vaibhav Mehta, et al. An Empirical Investigation of the Role of Pre-training in Lifelong Learning, 2021, J. Mach. Learn. Res.
[10] Bohan Zhuang, et al. Sharpness-aware Quantization for Deep Neural Networks, 2021, arXiv.
[11] Jeff Z. HaoChen, et al. Self-supervised Learning is More Robust to Dataset Imbalance, 2021, ICLR.
[12] Joey Tianyi Zhou, et al. Efficient Sharpness-aware Minimization for Improved Training of Neural Networks, 2021, ICLR.
[13] Pritish Narayanan, et al. Toward Software-Equivalent Accuracy on Transformer-Based Deep Neural Networks With Analog Memory Devices, 2021, Frontiers in Computational Neuroscience.
[14] B. Schiele, et al. Relating Adversarially Robust Generalization to Flat Minima, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[15] Jungmin Kwon, et al. ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks, 2021, ICML.
[16] Ariel Kleiner, et al. Sharpness-Aware Minimization for Efficiently Improving Generalization, 2020, ICLR.
[17] Geoffrey E. Hinton, et al. Lookahead Optimizer: k steps forward, 1 step back, 2019, NeurIPS.
[18] Evangelos Eleftheriou, et al. Accurate deep neural network inference using computational phase-change memory, 2019, Nature Communications.
[19] Marc'Aurelio Ranzato, et al. On Tiny Episodic Memories in Continual Learning, 2019.
[20] Andrew Gordon Wilson, et al. Averaging Weights Leads to Wider Optima and Better Generalization, 2018, UAI.
[21] Philip H. S. Torr, et al. Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence, 2018, ECCV.
[22] Nathan Srebro, et al. Exploring Generalization in Deep Learning, 2017, NIPS.
[23] Marc'Aurelio Ranzato, et al. Gradient Episodic Memory for Continual Learning, 2017, NIPS.
[24] Gintare Karolina Dziugaite, et al. Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data, 2017, UAI.
[25] Razvan Pascanu, et al. Sharp Minima Can Generalize For Deep Nets, 2017, ICML.
[26] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[27] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[28] Nikos Komodakis, et al. Wide Residual Networks, 2016, BMVC.
[29] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[31] Norman P. Jouppi, et al. Understanding the trade-offs in multi-level cell ReRAM memory design, 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).
[32] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[33] Cho-Jui Hsieh, et al. Random Sharpness-Aware Minimization, 2022, NeurIPS.
[34] Sarath Chandar, et al. Sharpness-Aware Training for Accurate Inference on Noisy DNN Accelerators, 2022, arXiv.
[35] Gunshi Gupta, et al. Look-ahead Meta Learning for Continual Learning, 2020, NeurIPS.
[36] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[37] Jürgen Schmidhuber, et al. Simplifying Neural Nets by Discovering Flat Minima, 1994, NIPS.