Data-driven heuristic dynamic programming with virtual reality

In this paper, we propose a virtual reality (VR) platform for machine learning research, using the goal representation heuristic dynamic programming (GrHDP) approach as a case study. The platform comprises three parts: a physical module, a control/learning module, and a VR module. It facilitates machine learning research by letting scientists and engineers participate in the simulation process to analyze dynamic experiments. The internal structure of the platform can be replaced according to the research target, so the platform can be extended to other applications. We present the detailed VR design strategy and demonstrate it on three applications: a triple-link inverted pendulum balancing problem, a maze navigation problem, and robot navigation with obstacle avoidance.
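The three-module structure described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and method names (`PhysicalModule`, `LearningModule`, `VRModule`, `Platform`, `ToyPlant`, `ProportionalController`, `RecordingView`) are hypothetical, the plant is a toy one-dimensional system, and a simple proportional rule stands in for GrHDP. It only shows how each part might be swapped without changing the simulation loop.

```python
from dataclasses import dataclass
from typing import List, Protocol


class PhysicalModule(Protocol):
    """Assumed interface for the simulated system (pendulum, maze, robot)."""
    def reset(self) -> float: ...
    def step(self, action: float) -> float: ...


class LearningModule(Protocol):
    """Assumed interface for the control/learning part."""
    def act(self, state: float) -> float: ...


class VRModule(Protocol):
    """Assumed interface for the visualization part."""
    def render(self, state: float) -> None: ...


class ToyPlant:
    """Hypothetical 1-D plant: the state changes by the applied action."""
    def __init__(self, start: float = 1.0) -> None:
        self.start = start
        self.state = start

    def reset(self) -> float:
        self.state = self.start
        return self.state

    def step(self, action: float) -> float:
        self.state += action
        return self.state


class ProportionalController:
    """Stand-in for the learner (not GrHDP): pushes the state toward zero."""
    def __init__(self, gain: float = 0.5) -> None:
        self.gain = gain

    def act(self, state: float) -> float:
        return -self.gain * state


class RecordingView:
    """Stand-in for the VR module: records states instead of drawing them."""
    def __init__(self) -> None:
        self.frames: List[float] = []

    def render(self, state: float) -> None:
        self.frames.append(state)


@dataclass
class Platform:
    """Wires the three modules together; any of them can be replaced."""
    physical: PhysicalModule
    learner: LearningModule
    view: VRModule

    def run(self, steps: int) -> float:
        state = self.physical.reset()
        self.view.render(state)
        for _ in range(steps):
            action = self.learner.act(state)
            state = self.physical.step(action)
            self.view.render(state)
        return state


if __name__ == "__main__":
    view = RecordingView()
    platform = Platform(ToyPlant(1.0), ProportionalController(0.5), view)
    print(platform.run(3), view.frames)
```

Under these assumptions, moving from one application to another would mean passing a different `PhysicalModule` (and, if needed, a different learner or view) to `Platform`, while `run` stays unchanged.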
