Learning automata with changing number of actions

A reinforcement scheme that is based on the linear reward-inaction updating algorithm is presented for a learning automaton whose action set changes from instant to instant. A learning automaton using the algorithm is shown to be both absolutely expedient and ε-optimal. The simulation results verify the ε-optimality of the algorithm. The results can be extended to the design of general nonlinear absolutely expedient learning algorithms.
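The abstract does not give the update equations, so the following is only a minimal sketch of how a linear reward-inaction update might be restricted to a changing action set. It assumes that, at each instant, the probabilities of the currently available actions are rescaled to sum to one, updated by the standard reward-inaction rule, and then scaled back so that the probabilities of unavailable actions are untouched. The names `lri_step`, `choose_action`, `simulate`, the step size `lam`, and the random choice of the available subset are all hypothetical and not taken from the paper.

```python
import random


def lri_step(p, available, chosen, reward, lam=0.1):
    """One reward-inaction update over the available actions only (assumed scheme)."""
    # Total probability mass currently held by the available actions.
    k = sum(p[i] for i in available)
    # Rescale so the available actions form a proper distribution.
    scaled = {i: p[i] / k for i in available}
    if reward:
        # Reward: move mass toward the chosen action.
        for i in available:
            if i == chosen:
                scaled[i] += lam * (1.0 - scaled[i])
            else:
                scaled[i] *= (1.0 - lam)
    # Penalty: inaction, so the scaled probabilities stay as they are.
    new_p = list(p)
    for i in available:
        # Scale back; unavailable actions keep their old probabilities.
        new_p[i] = scaled[i] * k
    return new_p


def choose_action(p, available, rng):
    """Sample an available action in proportion to its current probability."""
    actions = list(available)
    weights = [p[i] for i in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def simulate(reward_probs, steps=2000, lam=0.05, seed=0):
    """Run the automaton in a stationary random environment (illustrative only)."""
    rng = random.Random(seed)
    n = len(reward_probs)
    p = [1.0 / n] * n
    for _ in range(steps):
        # Assumption: a random subset of at least two actions is offered each instant.
        available = rng.sample(range(n), rng.randint(2, n))
        a = choose_action(p, available, rng)
        reward = rng.random() < reward_probs[a]
        p = lri_step(p, available, a, reward, lam)
    return p


if __name__ == "__main__":
    # Hypothetical reward probabilities for four actions.
    print(simulate([0.2, 0.4, 0.6, 0.8]))
```

In this sketch the total probability over all actions stays equal to one after every step, because the mass `k` of the available subset is only redistributed within that subset.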

B. R. Harita | M. Thathachar
