Eric P. Xing | Bryon Aragam | Bingjing Zhang | Aurick Qiao
[1] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[2] Abutalib Aghayev, et al. Litz: Elastic Framework for High-Performance Distributed Machine Learning, 2018, USENIX Annual Technical Conference.
[3] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[4] Alexander J. Smola, et al. Scaling Distributed Machine Learning with the Parameter Server, 2014, OSDI.
[5] Yuanzhou Yang, et al. Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes, 2018, arXiv.
[6] Carlos Guestrin, et al. Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud, 2012, PVLDB.
[7] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[8] Eric P. Xing, et al. High-Performance Distributed ML at Scale through Parameter Server Consistency Models, 2014, AAAI.
[9] Alexander J. Smola, et al. Communication Efficient Distributed Machine Learning with the Parameter Server, 2014, NIPS.
[10] Yurii Nesterov. Introductory Lectures on Convex Optimization: A Basic Course, 2004, Applied Optimization.
[11] Mark W. Schmidt, et al. Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization, 2011, NIPS.
[12] Yiming Yang, et al. RCV1: A New Benchmark Collection for Text Categorization Research, 2004, J. Mach. Learn. Res.
[13] Yurii Nesterov, et al. First-order methods of smooth convex optimization with inexact oracle, 2013, Mathematical Programming.
[14] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017, NIPS Autodiff Workshop.
[15] John T. Daly, et al. A higher order estimate of the optimum checkpoint interval for restart dumps, 2006, Future Gener. Comput. Syst.
[16] Seunghak Lee, et al. More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server, 2013, NIPS.
[17] Michael I. Jordan, et al. How to Escape Saddle Points Efficiently, 2017, ICML.
[18] Dan Walsh, et al. Design and implementation of the Sun network filesystem, 1985, USENIX Conference Proceedings.
[19] Dimitris S. Papailiopoulos, et al. Perturbed Iterate Analysis for Asynchronous Stochastic Optimization, 2015, SIAM J. Optim.
[20] Kenneth Y. Goldberg, et al. Eigentaste: A Constant Time Collaborative Filtering Algorithm, 2001, Information Retrieval.
[21] Jun S. Liu. The Collapsed Gibbs Sampler in Bayesian Computations with Applications to a Gene Regulation Problem, 1994, J. Am. Stat. Assoc.
[22] Stephen J. Wright, et al. Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[23] Rachid Guerraoui, et al. Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent, 2017, NIPS.
[24] Jean-Pierre Dussault, et al. A globally convergent algorithm for MPCC, 2015, EURO J. Comput. Optim.
[25] Rachid Guerraoui, et al. Asynchronous Byzantine Machine Learning (the case of SGD), 2018, ICML.
[26] Pritish Narayanan, et al. Deep Learning with Limited Numerical Precision, 2015, ICML.
[27] Eric P. Xing, et al. Managed communication and consistency for fast data-parallel iterative analytics, 2015, SoCC.
[28] Ohad Shamir, et al. Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization, 2011, ICML.
[29] Lili Su, et al. Distributed Statistical Machine Learning in Adversarial Settings: Byzantine Gradient Descent, 2017, Proc. ACM Meas. Anal. Comput. Syst.
[30] Wotao Yin, et al. A Globally Convergent Algorithm for Nonconvex Optimization Based on Block Coordinate Update, 2014, J. Sci. Comput.
[31] Jorge Nocedal, et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Rev.
[32] Mahadev Konar, et al. ZooKeeper: Wait-free Coordination for Internet-scale Systems, 2010, USENIX ATC.
[33] Seunghak Lee, et al. Exploiting Bounded Staleness to Speed Up Big Data Analytics, 2014, USENIX Annual Technical Conference.
[34] Michael I. Jordan, et al. Gradient Descent Can Take Exponential Time to Escape Saddle Points, 2017, NIPS.
[35] Ran El-Yaniv, et al. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations, 2016, J. Mach. Learn. Res.
[36] Alexander Sergeev, et al. Horovod: fast and easy distributed deep learning in TensorFlow, 2018, arXiv.
[37] Gregory R. Ganger, et al. Proteus: agile ML elasticity through tiered reliability in dynamic resource markets, 2017, EuroSys.
[38] Prashant Malik, et al. Cassandra: a decentralized structured storage system, 2010, ACM SIGOPS Oper. Syst. Rev.
[39] Carlo Curino, et al. Apache Hadoop YARN: yet another resource negotiator, 2013, SoCC.
[40] Yoshua Bengio, et al. Low precision arithmetic for deep learning, 2014, ICLR.
[41] Dan Alistarh, et al. ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning, 2017, ICML.
[42] Alexandros G. Dimakis, et al. Gradient Coding: Avoiding Stragglers in Distributed Learning, 2017, ICML.
[43] F. Maxwell Harper, et al. The MovieLens Datasets: History and Context, 2016, TIIS.
[44] Suhas N. Diggavi, et al. Straggler Mitigation in Distributed Optimization Through Data Encoding, 2017, NIPS.
[45] Zhe Wang, et al. Efficient top-K query calculation in distributed networks, 2004, PODC '04.
[46] Seunghak Lee, et al. Solving the Straggler Problem with Bounded Staleness, 2013, HotOS.
[47] Nicholas J. Higham, et al. Inverse Problems Newsletter, 1991.
[48] Randy H. Katz, et al. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center, 2011, NSDI.
[49] Mei Han An, et al. Accuracy and Stability of Numerical Algorithms, 1991.
[50] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[51] Geoffrey E. Hinton, et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[52] Simon Haykin, et al. Gradient-Based Learning Applied to Document Recognition, 2001.
[53] Chih-Jen Lin, et al. Field-aware Factorization Machines for CTR Prediction, 2016, RecSys.
[54] Rachid Guerraoui, et al. The Hidden Vulnerability of Distributed Learning in Byzantium, 2018, ICML.
[55] Eric P. Xing, et al. Addressing the straggler problem for iterative convergent parallel ML, 2016, SoCC.
[56] Pengtao Xie, et al. Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters, 2017, USENIX Annual Technical Conference.
[57] Lifeng Lai, et al. On randomized distributed coordinate descent with quantized updates, 2017, CISS.
[58] Furong Huang, et al. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition, 2015, COLT.
[59] Dimitris S. Papailiopoulos, et al. Speeding up distributed machine learning using codes, 2016, ISIT.
[60] Alexander Shapiro, et al. Stochastic Approximation Approach to Stochastic Programming, 2013.
[61] Ken Lang, et al. NewsWeeder: Learning to Filter Netnews, 1995, ICML.
[62] Carlos Maltzahn, et al. Ceph: a scalable, high-performance distributed file system, 2006, OSDI '06.