Chenxi Huang | Xinghao Ding | John Paisley | Yue Huang | Weihong Zeng | Huangxing Lin | Yihong Zhuang
[1] Peter Schlicht, et al. Introducing Noise in Decentralized Training of Neural Networks, 2018, DMLE/IOTSTREAMING@PKDD/ECML.
[2] Nathan Srebro, et al. The Marginal Value of Adaptive Gradient Methods in Machine Learning, 2017, NIPS.
[3] Yann LeCun, et al. Regularization of Neural Networks using DropConnect, 2013, ICML.
[4] Liyuan Liu, et al. On the Variance of the Adaptive Learning Rate and Beyond, 2019, ICLR.
[5] Subhransu Maji, et al. Semantic contours from inverse detectors, 2011, 2011 International Conference on Computer Vision.
[6] Quoc V. Le, et al. Adding Gradient Noise Improves Learning for Very Deep Networks, 2015, ArXiv.
[7] Misha Denil, et al. Noisy Activation Functions, 2016, ICML.
[8] Simon Haykin, et al. Gradient-Based Learning Applied to Document Recognition, 2001.
[9] Christopher D. Manning, et al. Fast dropout training, 2013, ICML.
[10] Stephen J. Wright. Coordinate descent algorithms, 2015, Mathematical Programming.
[11] Yurii Nesterov, et al. Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems, 2012, SIAM J. Optim.
[12] Guosheng Lin, et al. Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Julien Cornebise, et al. Weight Uncertainty in Neural Networks, 2015, ArXiv.
[14] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Iasonas Kokkinos, et al. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, 2014, ICLR.
[16] Geoffrey E. Hinton, et al. Stochastic Neighbor Embedding, 2002, NIPS.
[17] Qionghai Dai, et al. A PID Controller Approach for Stochastic Optimization of Deep Networks, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[18] Ning Qian, et al. On the momentum term in gradient descent learning algorithms, 1999, Neural Networks.
[19] Pascal Vincent, et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion, 2010, J. Mach. Learn. Res.
[20] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.
[21] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[22] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[24] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[25] Luc Van Gool, et al. The Pascal Visual Object Classes (VOC) Challenge, 2010, International Journal of Computer Vision.
[26] Alex Graves, et al. Practical Variational Inference for Neural Networks, 2011, NIPS.
[27] Junping Du, et al. Noisy Softmax: Improving the Generalization Ability of DCNN via Postponing the Early Softmax Saturation, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] George Papandreou, et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, 2018, ECCV.
[29] Xu Sun, et al. Adaptive Gradient Methods with Dynamic Bound of Learning Rate, 2019, ICLR.
[30] H. Robbins. A Stochastic Approximation Method, 1951.
[31] Sanjiv Kumar, et al. On the Convergence of Adam and Beyond, 2018.
[32] Qi Tian, et al. DisturbLabel: Regularizing CNN on the Loss Layer, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Ion Necoara, et al. Random Coordinate Descent Algorithms for Multi-Agent Convex Optimization Over Networks, 2013, IEEE Transactions on Automatic Control.
[34] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[35] Guozhong An, et al. The Effects of Adding Noise During Backpropagation Training on a Generalization Performance, 1996, Neural Computation.
[36] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.
[37] Wei Liu, et al. SSD: Single Shot MultiBox Detector, 2015, ECCV.
[38] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.
[39] Xiaogang Wang, et al. Pyramid Scene Parsing Network, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).