FTSGD: An Adaptive Stochastic Gradient Descent Algorithm for Spark MLlib