Adaptive Reinforcement Learning Method for Sequential Decision Task: A Review

There are many dynamic situations in which actions must be taken in sequence to bring about favorable circumstances. The consequences of an action can emerge at many different times after the action is taken, so this review is concerned with strategies for selecting actions on the basis of both their short-term and long-term consequences. A model-based approach requires constructing a model of the state-transition and payoff probabilities. A task of this kind can be described as a dynamical system whose behavior changes over time under the influence of a decision maker's actions. Modeling the behavior of such a system is greatly simplified by the concept of state. A decision policy associates an action with each system state. Adaptive methods are of great practical importance if they can improve the decision policy sufficiently rapidly. The review covers methods for estimating an optimal policy in the absence of a complete model of the decision task; such methods are known as adaptive methods.
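
As an illustration of what estimating a policy without a complete model can look like, the following is a minimal sketch of tabular Q-learning, one well-known adaptive method. It is not taken from the review itself. The four-state chain task, the `step` function, and all parameter values (`alpha`, `gamma`, `epsilon`, the episode count) are hypothetical choices made only for this example. The learner never reads the transition or payoff rules directly; it only observes the next state and payoff returned by `step`.

```python
import random

N_STATES = 4      # hypothetical toy task: states 0..3, state 3 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right

def step(state, action):
    """Environment dynamics, hidden from the learner: (next_state, payoff)."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    payoff = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, payoff

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Estimate action values from experience only (no model is built)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            nxt, payoff = step(state, action)
            # Target combines the short-term payoff with the discounted
            # estimate of long-term consequences from the next state.
            target = payoff + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q

def greedy_policy(q):
    """Decision policy: one action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(q_learning())` returns a list that assigns one action to each non-terminal state, which is the sense in which a decision policy associates an action with each system state. In this toy task the payoff lies at the right end of the chain, so the learned policy is expected to select action 1 (move right) in every state.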
