Distributed Offline Policy Optimization Over Batch Data