Reinforcement Learning by Probability Matching