Learning automata with continuous input and changing number of actions

The behaviour of a stochastic automaton operating in an S-model environment is described. The environment response takes an arbitrary value in the closed interval [0, 1] (continuous response). The learning automaton uses a reinforcement scheme to update its action probabilities on the basis of the environment's response. The complete set of actions is divided into a collection of non-empty subsets, and the set of available actions changes from instant to instant: at each instant, one of these subsets is selected according to a given probability distribution. Convergence and convergence rate results are presented. These results are derived using the theory of quasimartingales.
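
The abstract does not state which reinforcement scheme is used, so the following Python sketch is only an illustration of the setting, not the scheme analysed in the paper. It assumes an S-model linear reward-inaction update applied to probabilities rescaled over the currently available subset, and it treats the response as a penalty (0 is the most favourable value). The names `step`, `env`, `subsets`, `subset_probs` and the step size `a` are hypothetical.

```python
import random


def step(p, subsets, subset_probs, env, a=0.05, rng=random):
    """One learning step with a randomly selected action subset.

    p            -- list of action probabilities over the complete action set
    subsets      -- list of non-empty lists of action indices
    subset_probs -- probability of each subset being the available one
    env          -- callable mapping an action index to a response in [0, 1]
                    (assumed here: 0 is the most favourable response)
    a            -- step size in (0, 1); an assumed parameter
    """
    # Select the subset of actions available at this instant.
    active = rng.choices(subsets, weights=subset_probs, k=1)[0]

    # Rescale the probabilities so they sum to one over the active subset.
    k = sum(p[i] for i in active)
    scaled = [p[i] / k for i in active]

    # Choose an action from the active subset and observe the response.
    idx = rng.choices(range(len(active)), weights=scaled, k=1)[0]
    action = active[idx]
    beta = env(action)

    # Assumed scheme: S-model linear reward-inaction on the scaled vector.
    reward = a * (1.0 - beta)
    for j in range(len(active)):
        if j == idx:
            scaled[j] += reward * (1.0 - scaled[j])
        else:
            scaled[j] -= reward * scaled[j]

    # Map the updated values back; inactive actions are left unchanged.
    for j, i in enumerate(active):
        p[i] = scaled[j] * k
    return action, beta


if __name__ == "__main__":
    rng = random.Random(0)
    mean_penalty = [0.8, 0.2, 0.6]          # hypothetical environment
    p = [1.0 / 3] * 3
    subsets = [[0, 1], [1, 2], [0, 2]]
    for _ in range(2000):
        step(p, subsets, [1 / 3, 1 / 3, 1 / 3],
             lambda i: mean_penalty[i], rng=rng)
    print(p)
```

Rescaling by `k` before the update and multiplying by it afterwards keeps the full probability vector summing to one, because only the mass already assigned to the active subset is redistributed.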