Action selection and learning in multi-agent environments