QTAccel: A Generic FPGA based Design for Q-Table based Reinforcement Learning Accelerators
Yuan Meng | Ajitesh Srivastava | Rajgopal Kannan | Viktor Prasanna | Sanmukh Kuppannagari | Rachit Rajat
[1] Zengshi Chen, et al. Reinforcement Learning: An Introduction: R.S. Sutton, A.G. Barto, MIT Press, Cambridge, MA 1998, 322 pp. ISBN 0-262-19398-1, 2000, Neurocomputing.
[2] Apostolos Burnetas, et al. Optimal Adaptive Policies for Markov Decision Processes, 1997, Math. Oper. Res.
[3] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[4] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[5] Pranay Reddy Gankidi. FPGA Accelerator Architecture for Q-learning and its Applications in Space Exploration Rovers, 2016.
[6] Rocco Fazzolari, et al. An Efficient Hardware Implementation of Reinforcement Learning: The Q-Learning Algorithm, 2019, IEEE Access.
[7] H Robbins, et al. Sequential choice from several populations, 1995, Proceedings of the National Academy of Sciences of the United States of America.
[8] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[9] Arjun Chandra, et al. Efficient Parallel Methods for Deep Reinforcement Learning, 2017, ArXiv.
[10] Viktor K. Prasanna, et al. Energy performance of FPGAs on PERFECT suite kernels, 2014, 2014 IEEE High Performance Extreme Computing Conference (HPEC).
[11] Peter Vrancx, et al. Reinforcement Learning: State-of-the-Art, 2012.
[12] Jekan Thangavelautham, et al. FPGA architecture for deep learning and its application to planetary robotics, 2017, 2017 IEEE Aerospace Conference.
[13] Jaejin Lee, et al. FA3C: FPGA-Accelerated Deep Reinforcement Learning, 2019, ASPLOS.
[14] Yuan Meng, et al. QTAccel: A Generic FPGA based Design for Q-Table based Reinforcement Learning Accelerators, 2020, 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[15] Yantao Shen, et al. A Comparison of Various Approaches to Reinforcement Learning Algorithms for Multi-robot Box Pushing, 2018, Advances in Engineering Research and Application.
[16] Shane Legg, et al. Massively Parallel Methods for Deep Reinforcement Learning, 2015, ArXiv.
[17] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.
[18] John Langford, et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information, 2007, NIPS.
[19] Tim Güneysu, et al. Towards Efficient Arithmetic for Lattice-Based Cryptography on Reconfigurable Hardware, 2012, LATINCRYPT.
[20] Yuxi Li, et al. Deep Reinforcement Learning: An Overview, 2017, ArXiv.
[21] Sergey Levine, et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[22] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[23] Elwin Chandra Monie, et al. Hardware Architecture of Reinforcement Learning Scheme for Dynamic Power Management in Embedded Systems, 2007, EURASIP J. Embed. Syst.
[24] Marcelo A. C. Fernandes, et al. Parallel Implementation of Reinforcement Learning Q-Learning Technique for FPGA, 2019, IEEE Access.
[25] Michael N. Katehakis, et al. The Multi-Armed Bandit Problem: Decomposition and Computation, 1987, Math. Oper. Res.
[26] Setareh Maghsudi, et al. Multi-armed bandits with application to 5G small cells, 2015, IEEE Wireless Communications.
[27] David B. Thomas, et al. Neural Network Based Reinforcement Learning Acceleration on FPGA Platforms, 2017, CARN.
[28] Frederik Vercauteren, et al. High Precision Discrete Gaussian Sampling on FPGAs, 2013, Selected Areas in Cryptography.
[29] Kapil R. Dandekar, et al. Learning State Selection for Reconfigurable Antennas: A Multi-Armed Bandit Approach, 2014, IEEE Transactions on Antennas and Propagation.
[30] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.