Ray RLlib: A Framework for Distributed Reinforcement Learning
Michael I. Jordan | Ion Stoica | Joseph E. Gonzalez | Ken Goldberg | Roy Fox | Richard Liaw | Eric Liang | Robert Nishihara | Philipp Moritz
[1] Xi Chen, et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning, 2017, ArXiv.
[2] David Budden, et al. Distributed Prioritized Experience Replay, 2018, ICLR.
[3] Michael I. Jordan, et al. Ray: A Distributed Framework for Emerging AI Applications, 2017, OSDI.
[4] Benjamin Hindman, et al. Composing parallel software efficiently with lithe, 2010, PLDI '10.
[5] Anthony Skjellum, et al. A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard, 1996, Parallel Comput.
[6] Alexander J. Smola, et al. Scaling Distributed Machine Learning with the Parameter Server, 2014, OSDI.
[7] Scott Shenker, et al. Spark: Cluster Computing with Working Sets, 2010, HotCloud.
[8] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[9] Goetz Graefe, et al. Encapsulation of Parallelism and Architecture-Independence in Extensible Database Query Execution, 1993, IEEE Trans. Software Eng.
[10] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[11] Wilson C. Hsieh, et al. Bigtable: A Distributed Storage System for Structured Data, 2006, TOCS.
[12] Luiz André Barroso, et al. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, 2009.
[13] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[14] Max Jaderberg, et al. Population Based Training of Neural Networks, 2017, ArXiv.
[15] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[16] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.
[17] Ameet Talwalkar, et al. Hyperband: Bandit-Based Configuration Evaluation for Hyperparameter Optimization, 2016, ICLR.
[18] James Davidson, et al. TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow, 2017, ArXiv.
[19] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.