暂无分享,去创建一个
[1] Hakan Bilen,et al. Mode Normalization , 2018, ICLR.
[2] Nicolas Bonnotte. Unidimensional and Evolution Methods for Optimal Transportation , 2013 .
[3] Zhe Gan,et al. Improving Sequence-to-Sequence Learning via Optimal Transport , 2019, ICLR.
[4] Lawrence Carin,et al. Policy Optimization as Wasserstein Gradient Flows , 2018, ICML.
[5] Lei Huang,et al. Decorrelated Batch Normalization , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[6] David Rolnick,et al. Measuring and regularizing networks in function space , 2018, ICLR.
[7] Lior Wolf,et al. Using the Output Embedding to Improve Language Models , 2016, EACL.
[8] Hossein Mobahi,et al. Learning with a Wasserstein Loss , 2015, NIPS.
[9] Andrea Vedaldi,et al. Universal representations: The missing link between faces, text, planktons, and cat breeds , 2017, ArXiv.
[10] Yann Dauphin,et al. Convolutional Sequence to Sequence Learning , 2017, ICML.
[11] Thomas Hofmann,et al. Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization , 2018, AISTATS.
[12] Richard Socher,et al. Revisiting Activation Regularization for Language RNNs , 2017, ArXiv.
[13] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[14] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[15] Carla P. Gomes,et al. Understanding Batch Normalization , 2018, NeurIPS.
[16] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Kaiming He,et al. Group Normalization , 2018, ECCV.
[18] Sepp Hochreiter,et al. The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions , 1998, Int. J. Uncertain. Fuzziness Knowl. Based Syst..
[19] Razvan Pascanu,et al. Natural Neural Networks , 2015, NIPS.
[20] Julien Rabin,et al. Wasserstein Barycenter and Its Application to Texture Mixing , 2011, SSVM.
[21] Filippo Santambrogio,et al. Optimal Transport for Applied Mathematicians , 2015 .
[22] Mubarak Shah,et al. Training Faster by Separating Modes of Variation in Batch-Normalized Models , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[23] Gustavo K. Rohde,et al. Sliced Wasserstein Auto-Encoders , 2018, ICLR.
[24] F. Santambrogio. Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling , 2015 .
[25] Nicol N. Schraudolph,et al. Accelerated Gradient Descent by Factor-Centering Decomposition , 1998 .
[26] Quoc V. Le,et al. DropBlock: A regularization method for convolutional networks , 2018, NeurIPS.
[27] Zoubin Ghahramani,et al. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks , 2015, NIPS.
[28] Thomas Hofmann,et al. Towards a Theoretical Understanding of Batch Normalization , 2018, ArXiv.
[29] Jascha Sohl-Dickstein,et al. A Mean Field Theory of Batch Normalization , 2019, ICLR.
[30] Aaron C. Courville,et al. Improved Training of Wasserstein GANs , 2017, NIPS.
[31] Lukás Burget,et al. Recurrent neural network based language model , 2010, INTERSPEECH.
[32] Hakan Inan,et al. Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling , 2016, ICLR.
[33] Wei Xiong,et al. Regularizing Deep Convolutional Neural Networks with a Structured Decorrelation Constraint , 2016, 2016 IEEE 16th International Conference on Data Mining (ICDM).
[34] Hermann Ney,et al. Mean-normalized stochastic gradient for large-scale deep learning , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Tomaso A. Poggio,et al. Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning , 2016, ArXiv.
[36] Aleksander Madry,et al. How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift) , 2018, NeurIPS.
[37] Aaron C. Courville,et al. Recurrent Batch Normalization , 2016, ICLR.
[38] Tapani Raiko,et al. Deep Learning Made Easier by Linear Transformations in Perceptrons , 2012, AISTATS.
[39] Sepp Hochreiter,et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.
[40] Lior Wolf,et al. Regularizing by the Variance of the Activations' Sample-Variances , 2018, NeurIPS.
[41] Razvan Pascanu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.
[42] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[43] Yann LeCun,et al. Regularization of Neural Networks using DropConnect , 2013, ICML.
[44] Lei Huang,et al. Iterative Normalization: Beyond Standardization Towards Efficient Whitening , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[46] Ross B. Girshick,et al. Reducing Overfitting in Deep Networks by Decorrelating Representations , 2015, ICLR.
[47] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[48] Ohad Shamir,et al. Failures of Gradient-Based Deep Learning , 2017, ICML.
[49] Léon Bottou,et al. Wasserstein Generative Adversarial Networks , 2017, ICML.
[50] Frederick R. Forst,et al. On robust estimation of the location parameter , 1980 .
[51] Jiri Matas,et al. All you need is a good init , 2015, ICLR.
[52] Klaus-Robert Müller,et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.
[53] Tengyu Ma,et al. Fixup Initialization: Residual Learning Without Normalization , 2019, ICLR.
[54] Richard Socher,et al. Pointer Sentinel Mixture Models , 2016, ICLR.
[55] Surya Ganguli,et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks , 2013, ICLR.
[56] Brian McWilliams,et al. The Shattered Gradients Problem: If resnets are the answer, then what is the question? , 2017, ICML.
[57] Nicolas Le Roux,et al. Topmoumoute Online Natural Gradient Algorithm , 2007, NIPS.
[58] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[59] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[60] Elad Hoffer,et al. Norm matters: efficient and accurate normalization schemes in deep networks , 2018, NeurIPS.