Cho-Jui Hsieh | Minhao Cheng | Xiangning Chen | Yang You | Yong Liu