GoSGD: Distributed Optimization for Deep Learning with Gossip Exchange
[1] Stephen P. Boyd, et al. Randomized gossip algorithms, 2006, IEEE Transactions on Information Theory.
[2] Kunihiko Fukushima, et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, 1980, Biological Cybernetics.
[3] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[4] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[5] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[6] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Yann LeCun, et al. Regularization of Neural Networks using DropConnect, 2013, ICML.
[8] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[9] David Picard, et al. Decentralized K-Means Using Randomized Gossip Protocols for Clustering Large Datasets, 2013, 2013 IEEE 13th International Conference on Data Mining Workshops.
[10] Yann LeCun, et al. Deep learning with Elastic Averaging SGD, 2014, NIPS.
[11] Yann LeCun, et al. The Loss Surfaces of Multilayer Networks, 2014, AISTATS.
[12] David Picard, et al. Asynchronous gossip principal components analysis, 2015, Neurocomputing.
[13] David G. Luenberger, et al. Linear and nonlinear programming, 1984.
[14] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[16] Johannes Gehrke, et al. Gossip-based computation of aggregate information, 2003, 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings.
[17] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.
[18] He Ma, et al. Theano-MPI: A Theano-Based Distributed Training Framework, 2016, Euro-Par Workshops.
[19] Stéphan Clémençon, et al. Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions, 2016, ICML.
[20] Stephen P. Boyd, et al. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, 2011, Found. Trends Mach. Learn.
[21] Léon Bottou, et al. Large-Scale Machine Learning with Stochastic Gradient Descent, 2010, COMPSTAT.
[22] Giancarlo Fortino, et al. Epidemic K-Means Clustering, 2011, 2011 IEEE 11th International Conference on Data Mining Workshops.
[23] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[24] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.