FPGA architecture for deep learning and its application to planetary robotics

Autonomous control systems onboard planetary rovers and spacecraft benefit from cognitive capabilities such as learning, which let them adapt to unexpected situations in situ. Q-learning is a form of reinforcement learning that has been efficient in solving certain classes of learning problems. However, embedded systems onboard planetary rovers and spacecraft rarely implement learning algorithms because of constraints faced in the field, such as processing power, chip size, convergence rate, and the costs of radiation hardening. These challenges present a compelling need for a portable, low-power, area-efficient hardware accelerator that makes learning algorithms practical onboard space hardware. This paper presents an FPGA implementation of Q-learning with Artificial Neural Networks (ANN). The method matches the massive parallelism inherent in neural network software with the fine-grain parallelism of FPGA hardware, thereby dramatically reducing processing time. The Mars Science Laboratory currently uses Xilinx space-grade Virtex FPGA devices for image processing, pyrotechnic operation control, and obstacle avoidance. We simulate and program our architecture on a Xilinx Virtex 7 FPGA. Architectural implementations of a single-neuron Q-learning accelerator and a more complex Multilayer Perceptron (MLP) Q-learning accelerator are demonstrated. The results show up to a 43-fold speedup on Virtex 7 FPGAs compared to a conventional Intel i5 2.3 GHz CPU. Finally, we simulate the proposed architecture using the Symphony simulator and compiler from Xilinx, and evaluate the performance and power consumption.
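To make the single-neuron Q-learning idea concrete, the following is a minimal software sketch, not the accelerator described in this paper. It assumes a hypothetical three-state corridor task, a one-hot encoding of the state-action pair, a sigmoid activation, and illustrative values for the learning rate and discount factor; none of these choices are taken from the paper. The neuron's output is treated as the Q-value, and the weights are adjusted toward the usual Q-learning target, the reward plus the discounted maximum Q-value of the next state.

```python
import math
import random

# Hypothetical toy task (an assumption, not from the paper):
# a corridor with states 0 and 1, plus a terminal goal state 2.
N_STATES = 3
N_ACTIONS = 2  # 0 = move left, 1 = move right


def features(state, action):
    """One-hot encoding of the (state, action) pair plus a bias input."""
    x = [0.0] * (N_STATES * N_ACTIONS + 1)
    x[state * N_ACTIONS + action] = 1.0
    x[-1] = 1.0  # bias term
    return x


def neuron(weights, x):
    """Single sigmoid neuron; its output in (0, 1) is used as the Q-value."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def step(state, action):
    """Deterministic corridor dynamics; reward 1 only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_update(weights, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_b Q(s', b)."""
    x = features(s, a)
    q = neuron(weights, x)
    if done:
        target = r
    else:
        target = r + gamma * max(
            neuron(weights, features(s_next, b)) for b in range(N_ACTIONS)
        )
    # Gradient of the squared TD error through the sigmoid activation.
    delta = (target - q) * q * (1.0 - q)
    return [w + alpha * delta * xi for w, xi in zip(weights, x)]


def train(episodes=3000, seed=0):
    """Train with a uniformly random exploration policy."""
    rng = random.Random(seed)
    weights = [0.0] * (N_STATES * N_ACTIONS + 1)
    for _ in range(episodes):
        s = rng.randrange(N_STATES - 1)  # start in a non-terminal state
        for _ in range(10):  # cap the episode length
            a = rng.randrange(N_ACTIONS)
            s_next, r, done = step(s, a)
            weights = q_update(weights, s, a, r, s_next, done)
            s = s_next
            if done:
                break
    return weights


def greedy_action(weights, state):
    """Action with the highest estimated Q-value in the given state."""
    return max(range(N_ACTIONS), key=lambda a: neuron(weights, features(state, a)))
```

Calling `train()` and then `greedy_action(weights, s)` for each non-terminal state gives the learned policy for this toy task. The multiply-accumulate in `neuron` and the per-weight update in `q_update` are the kind of independent operations that a hardware implementation could evaluate in parallel.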
