A Master Equation Formulation of the Reinforcement Scheme of Stochastic Learning Automata

We formulate the learning scheme as a discrete Markov process and transform its evolution equation into a continuous-time master equation. Treating the learning parameter as small, we derive a perturbation expansion of the master equation and obtain a Fokker-Planck equation approximation at low order in the learning parameter. We show that the global features of the reinforcement scheme of learning automata can be described within this approximation, because the deterministic term of the dynamics has a globally asymptotically stable fixed point.
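
As a rough illustration of the kind of expansion described above, the following is a generic textbook-style sketch, not the derivation used in this work. It assumes a scalar action probability p, a learning parameter \theta that sets the step size, a transition kernel W_\theta, and one learning step as the unit of time; all of these symbols are introduced here for illustration only.

```latex
% Hypothetical notation: p (action probability), \theta (learning parameter),
% W_\theta (one-step transition kernel), \alpha_k (rescaled jump moments).
\begin{align}
  % Discrete Markov process for the distribution of p after n steps
  P_{n+1}(p) &= \int W_\theta(p \mid p')\, P_n(p')\, dp' \\
  % Continuous-time master equation (gain minus loss)
  \frac{\partial P(p,t)}{\partial t}
    &= \int \bigl[ W_\theta(p \mid p')\, P(p',t)
                 - W_\theta(p' \mid p)\, P(p,t) \bigr]\, dp' \\
  % Jump moments, assuming each step has size of order \theta
  a_k(p) &= \int (p'-p)^k\, W_\theta(p' \mid p)\, dp' = \theta^k \alpha_k(p) \\
  % Truncation at second order gives a Fokker-Planck equation
  \frac{\partial P(p,t)}{\partial t}
    &\approx -\theta\, \frac{\partial}{\partial p}\bigl[ \alpha_1(p)\, P(p,t) \bigr]
      + \frac{\theta^2}{2}\, \frac{\partial^2}{\partial p^2}\bigl[ \alpha_2(p)\, P(p,t) \bigr]
\end{align}
```

In this sketch the deterministic term is the drift \theta\,\alpha_1(p), so the associated deterministic dynamics would be dp/dt = \theta\,\alpha_1(p), while the second-order term describes fluctuations that are smaller by a factor of \theta.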