Learning to Control using Image Feedback

Learning to control complex systems from nontraditional feedback, e.g., snapshot images, is an important task in diverse domains such as robotics, neuroscience, and biology (cellular systems). In this paper, we present a feedback control framework based on two neural networks (NNs) for designing control policies for systems that generate feedback in the form of images. In particular, we develop a deep Q-network (DQN)-driven learning control strategy that synthesizes a sequence of control inputs from snapshot images encoding the current state and control action of the system. To train the networks, we employ a direct error-driven learning (EDL) approach, which updates the NN weights in each layer using a set of linear transformations of the NN training error. We verify the efficacy of the proposed control strategy using numerical examples.
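As a rough illustration of the quantity a DQN-style controller is trained to match, the sketch below computes standard one-step Q-learning targets and greedy control actions from a batch of Q-values. It is a generic textbook form, not the paper's implementation: the function names, the discount factor `gamma`, and the array shapes are assumptions made for illustration, and the EDL weight update is not shown.

```python
import numpy as np

def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """One-step Q-learning targets: r + gamma * max_a' Q(s', a').

    rewards:       shape (batch,)
    next_q_values: shape (batch, num_actions), Q-values at the next state
    dones:         shape (batch,), True where the episode ended
    gamma:         discount factor (assumed value, not from the paper)
    """
    rewards = np.asarray(rewards, dtype=float)
    next_q_values = np.asarray(next_q_values, dtype=float)
    dones = np.asarray(dones, dtype=bool)
    # Best achievable value at the next state under the current estimate.
    best_next = next_q_values.max(axis=1)
    # Terminal transitions contribute only the immediate reward.
    return rewards + gamma * best_next * (~dones)

def greedy_actions(q_values):
    """Index of the highest-valued control action for each sample."""
    return np.asarray(q_values).argmax(axis=1)
```

In an image-feedback setting, `next_q_values` would come from a network evaluated on the next snapshot image; here it is simply passed in as an array.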
